Abstract

In this paper we study general $l_p$ regularized unconstrained minimization problems.
In particular, we derive lower bounds for nonzero entries of first- and second-order
stationary points, and hence also of local minimizers of the $l_p$ minimization problems. We
extend some existing iterative reweighted $l_1$ (IRL$_1$) and $l_2$ (IRL$_2$) minimization methods
to solve these problems and propose new variants for them in which each subproblem
has a closed-form solution. We also provide a unified convergence analysis for
these methods. In addition, we propose a novel Lipschitz continuous $\epsilon$-approximation to
$\|x\|_p^p$. Using this result, we develop new IRL$_1$ methods for the $l_p$ minimization problems
and show that any accumulation point of the sequence generated by these methods
is a first-order stationary point, provided that the approximation parameter $\epsilon$ is below
a computable threshold value. This is a remarkable result since all existing iterative
reweighted minimization methods require that $\epsilon$ be dynamically updated and approach
zero. Our computational results demonstrate that the new IRL$_1$ method is generally
more stable than the existing IRL$_1$ methods [21, 18] in terms of objective function value
and CPU time.

Key words: $l_p$ minimization, iterative reweighted $l_1$ minimization, iterative reweighted
$l_2$ minimization



1 Introduction 

Recently numerous optimization models and methods have been proposed for finding sparse
solutions to a system or an optimization problem (e.g., see [28, 10, 23, 30, 31, 26, 32]).
In this paper we are interested in one of those models, namely,
the $l_p$ regularized unconstrained nonlinear programming model

$$\min_{x \in \Re^n} \{F(x) := f(x) + \lambda \|x\|_p^p\} \tag{1}$$

for some $\lambda > 0$ and $p \in (0,1)$, where $f$ is a smooth function with $L_f$-Lipschitz-continuous
gradient in $\Re^n$, that is,

$$\|\nabla f(x) - \nabla f(y)\|_2 \le L_f \|x - y\|_2, \quad \forall x, y \in \Re^n,$$

and $f$ is bounded below in $\Re^n$. Here, $\|x\|_p := (\sum_{i=1}^n |x_i|^p)^{1/p}$ for any $x \in \Re^n$. One can observe
that as $p \downarrow 0$, problem (1) approaches the $l_0$ minimization problem

$$\min_{x \in \Re^n} f(x) + \lambda \|x\|_0, \tag{2}$$

which is an exact formulation of finding a sparse vector to minimize the function $f$. Some
efficient numerical methods such as iterative hard thresholding [5] and penalty decomposition
methods [26] have recently been proposed for solving (2). In addition, as $p \uparrow 1$, problem (1)
approaches the $l_1$ minimization problem

$$\min_{x \in \Re^n} f(x) + \lambda \|x\|_1, \tag{3}$$

which is a widely used convex relaxation for (2). When $f$ is a convex quadratic function, model
(3) is shown to be extremely effective in finding a sparse vector to minimize $f$. A variety of
efficient methods have been proposed for solving (3) over the last few years (e.g., see [29, 1, 23, 30, 31]).
Since problem (1) is intermediate between problems (2) and (3), one can expect that it is
also capable of seeking out a sparse vector to minimize $f$. As demonstrated by extensive
computational studies in [11], problem (1) can even produce a sparser solution than (3) does
while both achieve similar values of $f$.

A great deal of effort has recently been made by many researchers (e.g., see
[18, 20, 15, 22, 27, 2, 16]) to study problem (1) or its related problem

$$\min_{x \in \Re^n} \{\|x\|_p^p : Ax = b\}. \tag{4}$$

In particular, Chartrand [11], Chartrand and Staneva [12], Foucart and Lai [21], and Sun [27]
established some sufficient conditions for recovering the sparsest solution to an underdetermined
linear system $Ax = b$ by the model (4). Efficient iterative reweighted $l_1$ (IRL$_1$) and $l_2$ (IRL$_2$)
minimization algorithms were also proposed for finding an approximate solution to (4) by
Foucart and Lai [21] and Daubechies et al. [20], respectively. Though problem (4) is generally
NP hard (see [15, 22]), it is shown in [21, 20] that under some assumptions, the sequence
generated by IRL$_1$ and IRL$_2$ algorithms converges to the sparsest solution to the above linear
system, which is also the global minimizer of (4). In addition, Chen et al. [17] considered a
special case of problem (1) with $f(x) = \frac{1}{2}\|Ax - b\|_2^2$, namely, the problem

$$\min_{x \in \Re^n} \frac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_p^p. \tag{5}$$

They derived lower bounds for nonzero entries of local minimizers of (5) and also proposed a
hybrid orthogonal matching pursuit-smoothing gradient method for solving (5). Since $\|x\|_p^p$ is
non-Lipschitz continuous, Chen and Zhou [18] recently considered the following approximation
to (5):

$$\min_{x \in \Re^n} \frac{1}{2}\|Ax - b\|_2^2 + \lambda \sum_{i=1}^n (|x_i| + \epsilon)^p$$

for some small $\epsilon > 0$, and they also proposed an IRL$_1$ algorithm to solve this approximation
problem. Recently, Lai and Wang [25] considered another approximation to (5), which is

$$\min_{x \in \Re^n} \frac{1}{2}\|Ax - b\|_2^2 + \lambda \sum_{i=1}^n (|x_i|^2 + \epsilon)^{p/2},$$

and proposed an IRL$_2$ algorithm for solving this approximation. Very recently, Bian and
Chen [2] and Chen et al. [16] proposed a smoothing sequential quadratic programming (SQP)
algorithm and a smoothing trust region Newton (TRN) method, respectively, for solving a
class of nonsmooth nonconvex problems that include (1) as a special case. When applied to
problem (1), their methods first approximate $\|x\|_p^p$ by a suitable smooth function and then
apply an SQP or a TRN algorithm to solve the resulting approximation problem. Lately,
Bian et al. [3] proposed first- and second-order interior point algorithms for solving a class
of non-Lipschitz and nonconvex minimization problems with bounded box constraints, which
can be suitably applied to $l_p$ regularized minimization problems over a compact box.

In this paper we consider the general $l_p$ regularized unconstrained optimization problem (1). In
particular, we first derive lower bounds for nonzero entries of first- and second-order stationary
points, and hence also of local minimizers of (1). We then extend the aforementioned IRL$_1$
and IRL$_2$ methods [21, 20, 25, 18] to solve (1) and propose some new variants for them. We
also provide a unified convergence analysis for these methods. Finally, we propose a novel
Lipschitz continuous $\epsilon$-approximation to $\|x\|_p^p$ and also propose a locally Lipschitz continuous
function $F_\epsilon(x)$ to approximate $F(x)$. Subsequently, we develop IRL$_1$ minimization methods for
solving the resulting approximation problem $\min_{x \in \Re^n} F_\epsilon(x)$. We show that any accumulation
point of the sequence generated by these methods is a first-order stationary point of problem
(1), provided that $\epsilon$ is below a computable threshold value. This is a remarkable result since
all existing iterative reweighted minimization methods for $l_p$ minimization problems require
that $\epsilon$ be dynamically updated and approach zero.

The outline of this paper is as follows. In Subsection 1.1 we introduce some notation
that is used in the paper. In Section 2 we derive lower bounds for nonzero entries of
stationary points, and hence also of local minimizers of problem (1). We also propose a locally
Lipschitz continuous function $F_\epsilon(x)$ to approximate $F(x)$ and study some properties of the
approximation problem $\min_{x \in \Re^n} F_\epsilon(x)$. In Section 3, we extend the existing IRL$_1$ and IRL$_2$
minimization methods from problems (4) and (5) to general problems (1) and propose new
variants for them. We also provide a unified convergence analysis for these methods. In
Section 4 we propose new IRL$_1$ methods for solving (1) and establish their convergence. In
Section 5 we conduct numerical experiments to compare the performance of several IRL$_1$
minimization methods that are studied in this paper for (1). Finally, in Section 6 we present
some concluding remarks.

1.1 Notation

The set of all $n$-dimensional positive vectors is denoted by $\Re^n_+$. Given any $x \in \Re^n$ and
a scalar $r$, $|x|^r$ denotes an $n$-dimensional vector whose $i$th component is $|x_i|^r$. In addition,
$\mathrm{Diag}(x)$ denotes an $n \times n$ diagonal matrix whose diagonal is formed by the vector $x$. Given an
index set $B \subseteq \{1, \ldots, n\}$, $x_B$ denotes the sub-vector of $x$ indexed by $B$. Similarly, $X_{BB}$ denotes
the sub-matrix of $X$ whose rows and columns are indexed by $B$. In addition, if a matrix $X$ is
positive semidefinite, we write $X \succeq 0$. The sign operator is denoted by sgn, that is,

$$\mathrm{sgn}(t) = \begin{cases} 1 & \text{if } t > 0, \\ [-1, 1] & \text{if } t = 0, \\ -1 & \text{otherwise.} \end{cases}$$

Finally, for any $\beta < 0$, we define $0^\beta = \infty$.

2 Technical results 

In this section we derive lower bounds for nonzero entries of stationary points, and hence
also of local minimizers of problem (1). We also propose a nonsmooth but locally Lipschitz
continuous function $F_\epsilon(x)$ to approximate $F(x)$. Moreover, we show that when $\epsilon$ is below a
computable threshold value, a certain stationary point of the corresponding approximation
problem $\min_{x \in \Re^n} F_\epsilon(x)$ is also that of (1). This result plays a crucial role in developing new
IRL$_1$ methods for solving (1) in Section 4.

2.1 Lower bounds for nonzero entries of stationary points of (1)

The first- and second-order stationary points of problem (1) are defined in [17]. We first
review these definitions. Then we derive lower bounds for nonzero entries of the stationary
points, and hence also of local minimizers of problem (1).

Definition 1 Let $x^*$ be a vector in $\Re^n$ and $X^* = \mathrm{Diag}(x^*)$. $x^* \in \Re^n$ is a first-order stationary
point of (1) if

$$X^* \nabla f(x^*) + \lambda p |x^*|^p = 0. \tag{6}$$

In addition, $x^* \in \Re^n$ is a second-order stationary point of (1) if

$$(X^*)^T \nabla^2 f(x^*) X^* + \lambda p (p-1)\, \mathrm{Diag}(|x^*|^p) \succeq 0. \tag{7}$$

Similar to general unconstrained smooth optimization, we can show that any local minimizer
of (1) is also a stationary point as defined above.
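As a quick numerical illustration of Definition 1, the residual of condition (6) can be evaluated directly. The sketch below is not part of the original analysis; the function name `first_order_residual` and the one-dimensional test function $f(x) = \frac{1}{2}(x - c)^2$ are assumptions chosen only for illustration.

```python
import numpy as np

def first_order_residual(x, grad_f, lam, p):
    """Euclidean norm of X*grad f(x) + lam*p*|x|^p, the left-hand side of (6)."""
    x = np.asarray(x, dtype=float)
    return float(np.linalg.norm(x * grad_f(x) + lam * p * np.abs(x) ** p))

# Hypothetical example: f(x) = 0.5*(x - c)^2 in one dimension, so grad f(x) = x - c.
lam, p, c = 1.0, 0.5, 1.5
grad_f = lambda x: x - c

res_at_one = first_order_residual([1.0], grad_f, lam, p)   # 1*(1 - 1.5) + 0.5*1 = 0
res_at_zero = first_order_residual([0.0], grad_f, lam, p)  # x = 0 satisfies (6) trivially
res_at_two = first_order_residual([2.0], grad_f, lam, p)   # x = 2 is not stationary
```

Because $X^*$ multiplies the gradient in (6), every zero entry satisfies the condition automatically, which is why the bounds in this subsection concern only the nonzero entries.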

Proposition 2.1 Let $x^*$ be a local minimizer of (1) and $X^* = \mathrm{Diag}(x^*)$. The following
statements hold:

(i) $x^*$ is a first-order stationary point, that is, (6) holds at $x^*$.

(ii) Further, if $f$ is twice continuously differentiable in a neighborhood of $x^*$, then $x^*$ is a
second-order stationary point, that is, (7) holds at $x^*$.

Proof. (i) Let $B = \{i : x_i^* \neq 0\}$. Since $x^*$ is a local minimizer of (1), one can observe that
$x^*$ is also a local minimizer of

$$\min_{x \in \Re^n} \{f(x) + \lambda \|x_B\|_p^p : x_i = 0, \ i \notin B\}. \tag{8}$$

Note that the objective function of (8) is differentiable at $x^*$. The first-order optimality
condition of (8) yields

$$\frac{\partial f(x^*)}{\partial x_i} + \lambda p |x_i^*|^{p-1}\, \mathrm{sgn}(x_i^*) = 0, \quad \forall i \in B.$$

Multiplying both sides of the above equality by $x_i^*$, we have

$$x_i^* \frac{\partial f(x^*)}{\partial x_i} + \lambda p |x_i^*|^p = 0, \quad \forall i \in B.$$

Since $x_i^* = 0$ for $i \notin B$, we observe that the above equality also holds for $i \notin B$. Hence, (6)
holds.

(ii) By the assumption, we observe that the objective function of (8) is twice continuously
differentiable at $x^*$. The second-order optimality condition of (8) yields

$$\nabla^2_{BB} f(x^*) + \lambda p (p-1)\, \mathrm{Diag}(|x_B^*|^{p-2}) \succeq 0,$$

which, together with the fact that $X^* = \mathrm{Diag}(x^*)$ and $x_i^* = 0$ for $i \notin B$, implies that (7) holds.



Recently, Chen et al. [17] derived some interesting lower bounds for the nonzero entries
of local minimizers of problem (1) for the special case where $f(x) = \frac{1}{2}\|Ax - b\|_2^2$ for some
$A \in \Re^{m \times n}$ and $b \in \Re^m$. We next establish similar lower bounds for the nonzero entries of
stationary points, and hence also of local minimizers of general problem (1).

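Theorem 2.2 Let $x^*$ be a second-order stationary point of (1) and $B = \{i : x_i^* \neq 0\}$.
Suppose that $f$ is twice continuously differentiable in a neighborhood of $x^*$. Then the following
statement holds:

$$|x_i^*| \ge \left(\frac{\lambda p (1-p)}{L_f}\right)^{\frac{1}{2-p}}, \quad \forall i \in B. \tag{9}$$

Proof. Since $f$ is twice continuously differentiable in a neighborhood of $x^*$ and $f$ has
$L_f$-Lipschitz-continuous gradient in $\Re^n$, we see that $\|\nabla^2 f(x^*)\|_2 \le L_f$. In addition, since $x^*$
satisfies (7), we have

$$e_i^T[(X^*)^T \nabla^2 f(x^*) X^*]e_i + \lambda p (p-1)\, e_i^T[\mathrm{Diag}(|x^*|^p)]e_i \ge 0,$$

where $e_i$ is the $i$th coordinate vector. It then follows that for each $i \in B$,

$$[\nabla^2 f(x^*)]_{ii} + \lambda p (p-1) |x_i^*|^{p-2} \ge 0,$$

which yields

$$|x_i^*| \ge \left(\frac{\lambda p (1-p)}{[\nabla^2 f(x^*)]_{ii}}\right)^{\frac{1}{2-p}} \ge \left(\frac{\lambda p (1-p)}{L_f}\right)^{\frac{1}{2-p}}, \quad \forall i \in B.$$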
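Theorem 2.3 Let $x^*$ be a first-order stationary point of (1) satisfying $F(x^*) \le F(x^0) + \epsilon$ for some
$x^0 \in \Re^n$ and $\epsilon \ge 0$, and let $\underline{f} = \inf_{x \in \Re^n} f(x)$ and $B = \{i : x_i^* \neq 0\}$. Then the following
statement holds:

$$|x_i^*| \ge \left(\frac{\lambda p}{\sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}}\right)^{\frac{1}{1-p}}, \quad \forall i \in B. \tag{10}$$

Proof. Since $f$ has $L_f$-Lipschitz-continuous gradient in $\Re^n$, it is well known that

$$f(y) \le f(x) + \nabla f(x)^T (y - x) + \frac{L_f}{2}\|y - x\|_2^2, \quad \forall x, y \in \Re^n.$$

Letting $x = x^*$ and $y = x^* - \nabla f(x^*)/L_f$, we obtain that

$$f(x^* - \nabla f(x^*)/L_f) \le f(x^*) - \frac{1}{2 L_f}\|\nabla f(x^*)\|_2^2. \tag{11}$$

Note that

$$f(x^* - \nabla f(x^*)/L_f) \ge \inf_{x \in \Re^n} f(x) = \underline{f}, \qquad f(x^*) \le F(x^*) \le F(x^0) + \epsilon.$$

Using these relations and (11), we have

$$\|\nabla f(x^*)\|_2 \le \sqrt{2 L_f [f(x^*) - f(x^* - \nabla f(x^*)/L_f)]} \le \sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}. \tag{12}$$

Since $x^*$ satisfies (6), we obtain that for every $i \in B$,

$$|x_i^*| = \left(\frac{1}{\lambda p}\left|\frac{\partial f(x^*)}{\partial x_i}\right|\right)^{\frac{1}{p-1}} \ge \left(\frac{\|\nabla f(x^*)\|_2}{\lambda p}\right)^{\frac{1}{p-1}},$$

which together with (12) yields

$$|x_i^*| \ge \left(\frac{\lambda p}{\sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}}\right)^{\frac{1}{1-p}}, \quad \forall i \in B.$$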
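The two lower bounds (9) and (10) are closed-form expressions in the problem data, so they are cheap to evaluate. The following is a minimal sketch under assumed inputs: the function names and the numerical values of $\lambda$, $p$, $L_f$, $F(x^0)$ and $\underline{f}$ are hypothetical, and in practice $L_f$ and $\underline{f}$ would have to be known or bounded for the function at hand.

```python
import math

def second_order_lower_bound(lam, p, L_f):
    """Right-hand side of (9): (lam*p*(1-p)/L_f)**(1/(2-p))."""
    return (lam * p * (1.0 - p) / L_f) ** (1.0 / (2.0 - p))

def first_order_lower_bound(lam, p, L_f, F_x0, eps, f_low):
    """Right-hand side of (10); requires F_x0 + eps - f_low > 0."""
    gap = F_x0 + eps - f_low
    if gap <= 0.0:
        raise ValueError("F(x0) + eps - f_low must be positive")
    return (lam * p / math.sqrt(2.0 * L_f * gap)) ** (1.0 / (1.0 - p))

# Hypothetical data: lam = 1, p = 1/2, L_f = 1, F(x0) = 2, eps = 0, inf f = 0.
b2 = second_order_lower_bound(1.0, 0.5, 1.0)                 # (0.25)**(2/3)
b1 = first_order_lower_bound(1.0, 0.5, 1.0, 2.0, 0.0, 0.0)   # (0.5/2)**2 = 0.0625
```

Any nonzero entry of a stationary point that falls below the relevant bound signals that the point cannot be stationary in the corresponding sense, which suggests using the bounds as a thresholding test on computed iterates.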
2.2 Locally Lipschitz continuous approximation to (1)

It is known that for $p \in (0, 1)$, the function $\|x\|_p^p$ is not locally Lipschitz continuous at some
points in $\Re^n$ and the Clarke subdifferential does not exist there (see, for example, [17]). This
brings a great deal of challenge for designing algorithms for solving problem (1). In this
subsection we propose a nonsmooth but Lipschitz continuous $\epsilon$-approximation to $\|x\|_p^p$ for every $\epsilon > 0$.
As a consequence, we obtain a nonsmooth but locally Lipschitz continuous $\epsilon$-approximation
$F_\epsilon(x)$ to $F(x)$. Furthermore, we show that when $\epsilon$ is below a computable threshold value,
a certain stationary point of the corresponding approximation problem $\min_{x \in \Re^n} F_\epsilon(x)$ is also
that of (1).

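Lemma 2.4 Let $u > 0$ be arbitrarily given, and let $q$ be such that

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{13}$$

Define

$$h_u(t) := \min_{0 \le s \le u} p\left(|t| s - \frac{s^q}{q}\right), \quad \forall t \in \Re. \tag{14}$$

Then the following statements hold:

(i) $0 \le h_u(t) - |t|^p \le u^q$ for every $t \in \Re$.

(ii) $h_u$ is $pu$-Lipschitz continuous in $(-\infty, \infty)$, i.e.,

$$|h_u(t_1) - h_u(t_2)| \le pu\,|t_1 - t_2|, \quad \forall t_1, t_2 \in \Re.$$

(iii) The Clarke subdifferential of $h_u$, denoted by $\partial h_u$, exists everywhere, and it is given by

$$\partial h_u(t) = \begin{cases} p|t|^{p-1}\, \mathrm{sgn}(t) & \text{if } |t| > u^{q-1}, \\ pu\, \mathrm{sgn}(t) & \text{if } |t| \le u^{q-1}. \end{cases} \tag{15}$$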

Proof. (i) Let $g_t(s) = p(|t| s - s^q/q)$ for $s > 0$. Since $p \in (0, 1)$, we observe from (13) that
$q < 0$. It then implies that $g_t(s) \to \infty$ as $s \downarrow 0$. This together with the continuity of $g_t$ implies
that $h_u(t)$ is well-defined for all $t \in \Re$. In addition, it is easy to show that $g_t(\cdot)$ is convex in
$(0, \infty)$, and moreover, $\inf_{s > 0} g_t(s) = |t|^p$. Hence, we have

$$h_u(t) = \min_{0 \le s \le u} g_t(s) \ge \inf_{s > 0} g_t(s) = |t|^p, \quad \forall t \in \Re.$$

We next show that $h_u(t) - |t|^p \le u^q$ by dividing its proof into two cases.

1) Assume that $|t| > u^{q-1}$. Then, the optimal value of (14) is achieved at $s^* = |t|^{\frac{1}{q-1}}$ and
hence,

$$h_u(t) = p\left(|t| s^* - \frac{(s^*)^q}{q}\right) = |t|^p.$$

2) Assume that $|t| \le u^{q-1}$. It can be shown that the optimal value of (14) is achieved at
$s^* = u$. Using this result and the relation $|t| \le u^{q-1}$, we obtain that

$$h_u(t) = p\left(|t| u - \frac{u^q}{q}\right) \le p\left(u^{q-1} u - \frac{u^q}{q}\right) = u^q,$$

which implies that $h_u(t) - |t|^p \le h_u(t) \le u^q$.

Combining the above two cases, we conclude that statement (i) holds.

(ii) Let $\phi : [0, \infty) \to \Re$ be defined as follows:

$$\phi(t) = \begin{cases} t^p & \text{if } t > u^{q-1}, \\ p(tu - u^q/q) & \text{if } 0 \le t \le u^{q-1}. \end{cases}$$

It is not hard to see that

$$\phi'(t) = \begin{cases} p t^{p-1} & \text{if } t > u^{q-1}, \\ pu & \text{if } 0 \le t \le u^{q-1}. \end{cases} \tag{16}$$

Hence, $0 \le \phi'(t) \le pu$ for every $t \in [0, \infty)$, which implies that $\phi$ is $pu$-Lipschitz continuous on
$[0, \infty)$. In addition, one can observe from the proof of (i) that $h_u(t) = \phi(|t|)$ for all $t$. By the
chain rule, we easily conclude that $h_u$ is $pu$-Lipschitz continuous in $(-\infty, \infty)$.

(iii) Since $h_u$ is Lipschitz continuous everywhere, it follows from Theorem 2.5.1 of [19] that

$$\partial h_u(t) = \mathrm{cov}\left\{\lim_{t_k \in D \to t} h_u'(t_k)\right\}, \tag{17}$$

where cov denotes convex hull and $D$ is the set of points at which $h_u$ is differentiable. Recall
that $h_u(t) = \phi(|t|)$ for all $t$. Hence, $h_u'(t) = \phi'(|t|)\, \mathrm{sgn}(t)$ for every $t \neq 0$. Using this relation,
(16) and (17), we immediately see that statement (iii) holds. ■
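The proof of Lemma 2.4 gives $h_u$ in closed form through the function $\phi$, which makes statements (i) and (ii) easy to check numerically. The sketch below is an illustration only; the function name `h_u`, the grid, and the values $u = 2$, $p = 1/2$ are assumptions, not part of the original text.

```python
import numpy as np

def h_u(t, u, p):
    """Closed form of h_u(t) = min_{0<=s<=u} p*(|t|*s - s**q/q), where 1/p + 1/q = 1."""
    q = p / (p - 1.0)                        # q < 0 for p in (0, 1)
    a = np.abs(np.asarray(t, dtype=float))
    # |t| > u^(q-1): unconstrained minimizer is feasible and h_u(t) = |t|^p;
    # otherwise the minimum over [0, u] is attained at s = u.
    return np.where(a > u ** (q - 1.0), a ** p, p * (a * u - u ** q / q))

u, p = 2.0, 0.5                               # hypothetical values; q = -1, u^q = 0.5
q = p / (p - 1.0)
t = np.linspace(-3.0, 3.0, 2001)
vals = h_u(t, u, p)

gap = vals - np.abs(t) ** p                   # statement (i): 0 <= gap <= u^q
slopes = np.abs(np.diff(vals) / np.diff(t))   # statement (ii): slopes <= p*u
```

With these values the threshold is $u^{q-1} = 0.25$, and for $|t| \le 0.25$ the formula reduces to $h_u(t) = |t| + 0.25$, so for example $h_u(0.1) = 0.35$.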



Corollary 2.5 Let $u > 0$ be arbitrarily given, and let $h(x) = \sum_{i=1}^n h_u(x_i)$ for every $x \in \Re^n$,
where $h_u$ is defined in (14). Then the following statements hold:

(i) $0 \le h(x) - \|x\|_p^p \le n u^q$ for every $x \in \Re^n$.

(ii) $h$ is $\sqrt{n}\,pu$-Lipschitz continuous in $\Re^n$, i.e.,

$$|h(x) - h(y)| \le \sqrt{n}\,pu\,\|x - y\|_2, \quad \forall x, y \in \Re^n.$$

(iii) The Clarke subdifferential of $h$ exists at every $x \in \Re^n$.

We are now ready to propose a nonsmooth but locally Lipschitz continuous $\epsilon$-approximation
to $F(x)$.

Proposition 2.6 Let $\epsilon > 0$ be arbitrarily given. Define

$$F_\epsilon(x) := f(x) + \lambda \sum_{i=1}^n h_{u_\epsilon}(x_i), \tag{18}$$

where

$$h_{u_\epsilon}(t) := \min_{0 \le s \le u_\epsilon} p\left(|t| s - \frac{s^q}{q}\right), \qquad u_\epsilon := \left(\frac{\epsilon}{\lambda n}\right)^{\frac{1}{q}}. \tag{19}$$

Then the following statements hold:

(i) $0 \le F_\epsilon(x) - F(x) \le \epsilon$ for every $x \in \Re^n$.

(ii) $F_\epsilon$ is locally Lipschitz continuous in $\Re^n$. Furthermore, if $f$ is Lipschitz continuous, so
is $F_\epsilon$.

(iii) The Clarke subdifferential of $F_\epsilon$ exists at every $x \in \Re^n$.

Proof. Using the definitions of $F_\epsilon$ and $F$, we have $F_\epsilon(x) - F(x) = \lambda(\sum_{i=1}^n h_{u_\epsilon}(x_i) - \|x\|_p^p)$,
which, together with Corollary 2.5 (i) with $u = u_\epsilon$, implies that statement (i) holds. Since $f$
is differentiable in $\Re^n$, it is known that $f$ is locally Lipschitz continuous. In addition, we know
from Corollary 2.5 (ii) that $\sum_{i=1}^n h_{u_\epsilon}(x_i)$ is Lipschitz continuous in $\Re^n$. These facts imply that
statement (ii) holds. Statement (iii) immediately follows from Corollary 2.5 (iii). ■
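Statement (i) of Proposition 2.6 can be checked numerically once $u_\epsilon$ is fixed. The sketch below assumes $u_\epsilon = (\epsilon/(\lambda n))^{1/q}$, which is the choice that makes $\lambda n u_\epsilon^q = \epsilon$ in Corollary 2.5 (i); the smooth term $f(x) = \frac{1}{2}\|x - c\|_2^2$, the vector `c`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def h_u(t, u, p):
    """Closed form of h_u from the proof of Lemma 2.4."""
    q = p / (p - 1.0)
    a = np.abs(np.asarray(t, dtype=float))
    return np.where(a > u ** (q - 1.0), a ** p, p * (a * u - u ** q / q))

def F(x, f, lam, p):
    """Objective of (1): f(x) + lam*||x||_p^p."""
    return f(x) + lam * np.sum(np.abs(x) ** p)

def F_eps(x, f, lam, p, eps):
    """Approximation (18), assuming u_eps = (eps/(lam*n))**(1/q)."""
    q = p / (p - 1.0)
    u_eps = (eps / (lam * x.size)) ** (1.0 / q)
    return f(x) + lam * np.sum(h_u(x, u_eps, p))

# Hypothetical smooth term f(x) = 0.5*||x - c||^2.
c = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
f = lambda x: 0.5 * np.sum((x - c) ** 2)
lam, p, eps = 1.0, 0.5, 0.1

rng = np.random.default_rng(0)
points = [np.zeros(5)] + [rng.normal(scale=s, size=5) for s in (1e-4, 1e-2, 1.0)]
gaps = [F_eps(x, f, lam, p, eps) - F(x, f, lam, p) for x in points]
```

At $x = 0$ each term contributes $h_{u_\epsilon}(0) = (1-p)u_\epsilon^q$, so the gap there equals $(1-p)\epsilon$; entries whose magnitude exceeds $u_\epsilon^{q-1}$ contribute nothing to the gap.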

From Proposition 2.6 we know that $F_\epsilon$ is a nice $\epsilon$-approximation to $F$. It is very natural
to find an approximate solution of (1) by solving the corresponding $\epsilon$-approximation problem

$$\min_{x \in \Re^n} F_\epsilon(x), \tag{20}$$

where $F_\epsilon$ is defined in (18). Strikingly, we can show that when $\epsilon$ is below a computable
threshold value, a certain stationary point of problem (20) is also that of (1).

Theorem 2.7 Let $x^0 \in \Re^n$ be an arbitrary point, and let $\epsilon$ be such that

$$0 < \epsilon < n \lambda \left[\frac{\sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}}{\lambda p}\right]^q, \tag{21}$$

where $\underline{f} = \inf_{x \in \Re^n} f(x)$. Suppose that $x^*$ is a first-order stationary point of (20) such
that $F_\epsilon(x^*) \le F_\epsilon(x^0)$. Then, $x^*$ is also a first-order stationary point of (1), i.e., (6) holds at
$x^*$. Moreover, the nonzero entries of $x^*$ satisfy the first-order lower bound (10).

Proof. Let $B = \{i : x_i^* \neq 0\}$. Since $x^*$ is a first-order stationary point of (20), we have
$0 \in \partial F_\epsilon(x^*)$. Hence, it follows that

$$\frac{\partial f(x^*)}{\partial x_i} + \lambda\, \partial h_{u_\epsilon}(x_i^*) = 0, \quad \forall i \in B. \tag{22}$$

In addition, we notice that

$$f(x^*) \le F(x^*) \le F_\epsilon(x^*) \le F_\epsilon(x^0) \le F(x^0) + \epsilon. \tag{23}$$

This relation together with (22) and (12) implies that

$$|\partial h_{u_\epsilon}(x_i^*)| = \frac{1}{\lambda}\left|\frac{\partial f(x^*)}{\partial x_i}\right| \le \frac{\sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}}{\lambda}, \quad \forall i \in B. \tag{24}$$

We now claim that $|x_i^*| > u_\epsilon^{q-1}$ for all $i \in B$, where $u_\epsilon$ is defined in (19). Suppose for
contradiction that there exists some $i \in B$ such that $0 < |x_i^*| \le u_\epsilon^{q-1}$. It then follows from
(15) that $|\partial h_{u_\epsilon}(x_i^*)| = p u_\epsilon$. Using this relation, (21) and the definition of $u_\epsilon$, we obtain that

$$|\partial h_{u_\epsilon}(x_i^*)| = p u_\epsilon = p\left(\frac{\epsilon}{\lambda n}\right)^{\frac{1}{q}} > \frac{\sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}}{\lambda},$$

which contradicts (24). Therefore, $|x_i^*| > u_\epsilon^{q-1}$ for all $i \in B$. Using this fact and (15), we see
that $\partial h_{u_\epsilon}(x_i^*) = p|x_i^*|^{p-1}\, \mathrm{sgn}(x_i^*)$ for every $i \in B$. Substituting it into (22), we obtain that

$$\frac{\partial f(x^*)}{\partial x_i} + \lambda p |x_i^*|^{p-1}\, \mathrm{sgn}(x_i^*) = 0, \quad \forall i \in B.$$

Multiplying both sides of the above equality by $x_i^*$, we have

$$x_i^* \frac{\partial f(x^*)}{\partial x_i} + \lambda p |x_i^*|^p = 0, \quad \forall i \in B.$$

Since $x_i^* = 0$ for $i \notin B$, we observe that the above equality also holds for $i \notin B$. Hence, (6)
holds. In addition, recall from (23) that $F(x^*) \le F(x^0) + \epsilon$. Using this relation and Theorem
2.3, we immediately see that the second part of this theorem also holds. ■
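The contradiction step in the proof above needs $\lambda p u_\epsilon$ to exceed $\sqrt{2 L_f [F(x^0) + \epsilon - \underline{f}]}$, so whether a given $\epsilon$ is small enough can be tested directly. The sketch below assumes $u_\epsilon = (\epsilon/(\lambda n))^{1/q}$ and uses hypothetical function names and data; the bisection relies on the left side decreasing and the right side increasing in $\epsilon$, so the admissible values form an interval $(0, \bar\epsilon)$.

```python
import math

def eps_below_threshold(eps, lam, p, n, L_f, F_x0, f_low):
    """True if lam*p*u_eps > sqrt(2*L_f*(F_x0 + eps - f_low)), u_eps = (eps/(lam*n))**(1/q)."""
    if eps <= 0.0:
        return False
    q = p / (p - 1.0)                                   # q < 0 for p in (0, 1)
    u_eps = (eps / (lam * n)) ** (1.0 / q)
    return lam * p * u_eps > math.sqrt(2.0 * L_f * (F_x0 + eps - f_low))

def threshold_eps(lam, p, n, L_f, F_x0, f_low, iters=200):
    """Bisection estimate of the largest admissible eps (a sketch, not from the paper)."""
    lo, hi = 0.0, 1.0
    while eps_below_threshold(hi, lam, p, n, L_f, F_x0, f_low):
        lo, hi = hi, 2.0 * hi                           # grow until hi is inadmissible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if eps_below_threshold(mid, lam, p, n, L_f, F_x0, f_low):
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical data: lam = 1, p = 1/2, n = 5, L_f = 1, F(x0) = 2, inf f = 0.
args = (1.0, 0.5, 5, 1.0, 2.0, 0.0)
ok_small = eps_below_threshold(0.1, *args)
ok_large = eps_below_threshold(2.0, *args)
eps_bar = threshold_eps(*args)
```

If only an upper bound on $L_f$ or a lower bound on $\underline{f}$ is available, substituting it makes the right side larger, so the test becomes more conservative but remains valid.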



Corollary 2.8 Let $x^0 \in \Re^n$ be an arbitrary point, and let $\epsilon$ be such that (21) holds. Suppose
that $x^*$ is a local minimizer of (20) such that $F_\epsilon(x^*) \le F_\epsilon(x^0)$. Then the following statements
hold:

(i) $x^*$ is a first-order stationary point of (1), i.e., (6) holds at $x^*$. Moreover, the nonzero
entries of $x^*$ satisfy the first-order lower bound (10).

(ii) Suppose further that $f$ is twice continuously differentiable in a neighborhood of $x^*$. Then,
$x^*$ is a second-order stationary point of (1), i.e., (7) holds at $x^*$. Moreover, the nonzero
entries of $x^*$ satisfy the second-order lower bound (9).

Proof. (i) Since $x^*$ is a local minimizer of (20), we know that $x^*$ is a stationary point of
(20). Statement (i) then immediately follows from Theorem 2.7.

(ii) Let $B = \{i : x_i^* \neq 0\}$. Since $x^*$ is a local minimizer of (20), we observe that $x^*$ is also a
local minimizer of

$$\min_{x \in \Re^n} \left\{ f(x) + \lambda \sum_{i \in B} h_{u_\epsilon}(x_i) : x_i = 0, \ i \notin B \right\}. \tag{25}$$

Notice that $x^*$ is a first-order stationary point of (20). In addition, $F(x^*) \le F(x^0) + \epsilon$ and $\epsilon$
satisfies (21). Using the same arguments as in the proof of Theorem 2.7, we have $|x_i^*| > u_\epsilon^{q-1}$
for all $i \in B$. Recall from the proof of Lemma 2.4 (i) that $h_{u_\epsilon}(t) = |t|^p$ if $|t| > u_\epsilon^{q-1}$. Hence,
$\sum_{i \in B} h_{u_\epsilon}(x_i) = \sum_{i \in B} |x_i|^p$ for $x$ in a neighborhood of $x^*$. This, together with the fact that $x^*$
is a local minimizer of (25), implies that $x^*$ is also a local minimizer of (8). The rest of the
proof is similar to that of Proposition 2.1 and Theorem 2.2. ■



3 A unified analysis for some existing iterative reweighted minimization methods

Recently two types of IRL$_1$ and IRL$_2$ methods have been proposed in the literature [21, 20,
25, 18] for solving problem (4) or (5). In this section we extend these methods to solve (1)
and also propose a variant of them in which each subproblem has a closed-form solution.
Moreover, we provide a unified convergence analysis for them.



3.1 The first type of IRL$_\alpha$ methods and its variant for (1)

In this subsection we consider the iterative reweighted minimization methods proposed in
[25, 18] for solving problem (5), which apply an IRL$_1$ or IRL$_2$ method to solve a sequence of
problems $\min_x Q_{1,\epsilon^k}(x)$ or $\min_x Q_{2,\epsilon^k}(x)$, where $\{\epsilon^k\}$ is a sequence of positive vectors
approaching zero as $k \to \infty$ and

$$Q_{\alpha,\epsilon}(x) := \frac{1}{2}\|Ax - b\|_2^2 + \lambda \sum_{i=1}^n (|x_i|^\alpha + \epsilon_i)^{\frac{p}{\alpha}}. \tag{26}$$

In what follows, we extend the above methods to solve (1) and also propose a variant of
them in which each subproblem has a closed-form solution. Moreover, we provide a unified
convergence analysis for them. Our key observation is that problem

$$\min_{x \in \Re^n} \left\{ F_{\alpha,\epsilon}(x) := f(x) + \lambda \sum_{i=1}^n (|x_i|^\alpha + \epsilon_i)^{\frac{p}{\alpha}} \right\} \tag{27}$$

for $\alpha \ge 1$ and $\epsilon > 0$ can be suitably solved by an iterative reweighted $l_\alpha$ (IRL$_\alpha$) method.
Problem (1) can then be solved by applying the IRL$_\alpha$ method to a sequence of problems (27)
with $\epsilon = \epsilon^k \to 0$ as $k \to \infty$.

We start by presenting an IRL$_\alpha$ method for solving problem (27) as follows.

An IRL$_\alpha$ minimization method for (27):

Choose an arbitrary $x^0$. Set $k = 0$.

1) Solve the weighted $l_\alpha$ minimization problem

$$x^{k+1} \in \mathrm{Arg}\min_{x \in \Re^n} \left\{ f(x) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i|^\alpha \right\}, \tag{28}$$

where $s_i^k = (|x_i^k|^\alpha + \epsilon_i)^{\frac{p}{\alpha} - 1}$ for all $i$.

2) Set $k \leftarrow k + 1$ and go to step 1).

end
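For $\alpha = 2$ and the least-squares choice $f(x) = \frac{1}{2}\|Ax - b\|_2^2$, subproblem (28) is a strictly convex quadratic whose minimizer solves the linear system $(A^TA + \lambda p\,\mathrm{Diag}(s^k))x = A^Tb$. The sketch below illustrates the iteration under that assumption; the function names, the random data, and the scalar $\epsilon$ (all components equal) are hypothetical choices, not taken from the original.

```python
import numpy as np

def F_alpha_eps(x, A, b, lam, p, eps, alpha=2.0):
    """Objective of (27) with f(x) = 0.5*||Ax - b||_2^2."""
    r = A @ x - b
    return 0.5 * (r @ r) + lam * np.sum((np.abs(x) ** alpha + eps) ** (p / alpha))

def irl2(A, b, lam, p, eps, x0, iters=30):
    """IRL_alpha iteration (28) specialized to alpha = 2 and a least-squares f."""
    AtA, Atb = A.T @ A, A.T @ b
    x = np.asarray(x0, dtype=float).copy()
    values = [F_alpha_eps(x, A, b, lam, p, eps)]
    for _ in range(iters):
        s = (x ** 2 + eps) ** (p / 2.0 - 1.0)                  # weights s_i^k > 0
        # Exact minimizer of 0.5*||Ax-b||^2 + (lam*p/2)*sum_i s_i*x_i^2.
        x = np.linalg.solve(AtA + lam * p * np.diag(s), Atb)
        values.append(F_alpha_eps(x, A, b, lam, p, eps))
    return x, values

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12))
b = rng.standard_normal(8)
x_final, values = irl2(A, b, lam=0.1, p=0.5, eps=1e-2, x0=np.zeros(12))
```

Since each subproblem is solved exactly, the recorded objective values should be non-increasing up to round-off, which is a convenient sanity check for an implementation.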

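We next show that any accumulation point of $\{x^k\}$ generated above is a first-order
stationary point of (27).

Theorem 3.1 Let the sequence $\{x^k\}$ be generated by the above IRL$_\alpha$ minimization method.
Suppose that $x^*$ is an accumulation point of $\{x^k\}$. Then $x^*$ is a first-order stationary point of
(27).

Proof. Let $q$ be such that

$$\frac{\alpha}{p} + \frac{1}{q} = 1. \tag{29}$$

It is not hard to show that for any $\delta > 0$,

$$(|t|^\alpha + \delta)^{\frac{p}{\alpha}} = \frac{p}{\alpha} \min_{s \ge 0} \left\{ (|t|^\alpha + \delta)s - \frac{s^q}{q} \right\}, \quad \forall t \in \Re, \tag{30}$$

and moreover, the minimum is achieved at $s = (|t|^\alpha + \delta)^{\frac{1}{q-1}}$. Using this result, the definition
of $s^k$ and (28), one can observe that for $k \ge 0$,

$$s^k = \arg\min_{s \ge 0} G_{\alpha,\epsilon}(x^k, s), \qquad x^{k+1} \in \mathrm{Arg}\min_{x} G_{\alpha,\epsilon}(x, s^k), \tag{31}$$

where

$$G_{\alpha,\epsilon}(x, s) = f(x) + \frac{\lambda p}{\alpha} \sum_{i=1}^n \left[ (|x_i|^\alpha + \epsilon_i)s_i - \frac{s_i^q}{q} \right]. \tag{32}$$

In addition, we see that $F_{\alpha,\epsilon}(x^k) = G_{\alpha,\epsilon}(x^k, s^k)$. It then follows that

$$F_{\alpha,\epsilon}(x^{k+1}) = G_{\alpha,\epsilon}(x^{k+1}, s^{k+1}) \le G_{\alpha,\epsilon}(x^{k+1}, s^k) \le G_{\alpha,\epsilon}(x^k, s^k) = F_{\alpha,\epsilon}(x^k). \tag{33}$$

Hence, $\{F_{\alpha,\epsilon}(x^k)\}$ is non-increasing. Since $x^*$ is an accumulation point of $\{x^k\}$, there exists
a subsequence $K$ such that $\{x^k\}_K \to x^*$. By the continuity of $F_{\alpha,\epsilon}$, we have $\{F_{\alpha,\epsilon}(x^k)\}_K \to
F_{\alpha,\epsilon}(x^*)$, which together with the monotonicity of $\{F_{\alpha,\epsilon}(x^k)\}$ implies that $F_{\alpha,\epsilon}(x^k) \to F_{\alpha,\epsilon}(x^*)$.
In addition, by the definition of $s^k$, we have $\{s^k\}_K \to s^*$, where $s_i^* = (|x_i^*|^\alpha + \epsilon_i)^{\frac{p}{\alpha} - 1}$ for all $i$.
Also, we observe that $F_{\alpha,\epsilon}(x^*) = G_{\alpha,\epsilon}(x^*, s^*)$. Using (33) and $F_{\alpha,\epsilon}(x^k) \to F_{\alpha,\epsilon}(x^*)$, we see that
$G_{\alpha,\epsilon}(x^{k+1}, s^k) \to F_{\alpha,\epsilon}(x^*) = G_{\alpha,\epsilon}(x^*, s^*)$. Further, it follows from (31) that

$$G_{\alpha,\epsilon}(x, s^k) \ge G_{\alpha,\epsilon}(x^{k+1}, s^k), \quad \forall x \in \Re^n.$$

Upon taking limits on both sides of this inequality as $k \in K \to \infty$, we have

$$G_{\alpha,\epsilon}(x, s^*) \ge G_{\alpha,\epsilon}(x^*, s^*), \quad \forall x \in \Re^n,$$

that is, $x^* \in \mathrm{Arg}\min_x G_{\alpha,\epsilon}(x, s^*)$, which, together with the first-order optimality condition and
the definition of $s^*$, yields

$$0 \in \frac{\partial f(x^*)}{\partial x_i} + \lambda p (|x_i^*|^\alpha + \epsilon_i)^{\frac{p}{\alpha} - 1} |x_i^*|^{\alpha - 1}\, \mathrm{sgn}(x_i^*), \quad \forall i. \tag{34}$$

Hence, $x^*$ is a stationary point of (27). ■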
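The variational identity (30) used in the proof of Theorem 3.1 can be verified with a short calculation. The derivation below is a sketch; the shorthand $a$ and $\varphi$ is introduced here for illustration only and does not appear in the original.

```latex
% Shorthand (not in the original): a := |t|^\alpha + \delta > 0 and
% \varphi(s) := a s - s^q / q for s > 0.
% From (29), q = p/(p - \alpha) < 0 because \alpha \ge 1 > p, so that
% 1/(q-1) = p/\alpha - 1,  q/(q-1) = p/\alpha,  1 - 1/q = \alpha/p.
\begin{align*}
\varphi'(s) &= a - s^{q-1}, \qquad \varphi''(s) = (1-q)\, s^{q-2} > 0, \\
\varphi'(s^*) = 0 \;&\Longrightarrow\; s^* = a^{\frac{1}{q-1}} = a^{\frac{p}{\alpha} - 1}, \\
\varphi(s^*) &= a^{\frac{q}{q-1}} \Bigl(1 - \tfrac{1}{q}\Bigr) = \frac{\alpha}{p}\, a^{\frac{p}{\alpha}}, \\
\frac{p}{\alpha}\, \varphi(s^*) &= a^{\frac{p}{\alpha}} = (|t|^\alpha + \delta)^{\frac{p}{\alpha}}.
\end{align*}
```

The minimizer $s^* = a^{p/\alpha - 1}$ is exactly the weight formula for $s_i^k$ in the IRL$_\alpha$ method, which is why the weight update can be read as an exact minimization of $G_{\alpha,\epsilon}(x^k, \cdot)$.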

The above IRL$_\alpha$ method needs to solve a sequence of reweighted $l_\alpha$ minimization problems
(28) whose solution may not be cheaply computable. We next propose a variant of this method
in which each subproblem is much simpler and has a closed-form solution for some $\alpha$'s (e.g.,
$\alpha = 1$ or $2$).

A variant of IRL$_\alpha$ minimization method for (27):

Let $0 < L_{\min} < L_{\max}$, $\tau > 1$ and $c > 0$ be given. Choose an arbitrary $x^0$ and set $k = 0$.

1) Choose $L_k^0 \in [L_{\min}, L_{\max}]$ arbitrarily. Set $L_k = L_k^0$.

   1a) Solve the weighted $l_\alpha$ minimization problem

   $$x^{k+1} \in \mathrm{Arg}\min_{x} \left\{ f(x^k) + \nabla f(x^k)^T (x - x^k) + \frac{L_k}{2}\|x - x^k\|_2^2 + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i|^\alpha \right\}, \tag{35}$$

   where $s_i^k = (|x_i^k|^\alpha + \epsilon_i)^{\frac{p}{\alpha} - 1}$ for all $i$.

   1b) If

   $$F_{\alpha,\epsilon}(x^k) - F_{\alpha,\epsilon}(x^{k+1}) \ge \frac{c}{2}\|x^{k+1} - x^k\|_2^2 \tag{36}$$

   is satisfied, where $F_{\alpha,\epsilon}$ is given in (27), then go to step 2).

   1c) Set $L_k \leftarrow \tau L_k$ and go to step 1a).

2) Set $k \leftarrow k + 1$ and go to step 1).

end
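For $\alpha = 1$, subproblem (35) separates across coordinates and its solution is a weighted soft-thresholding of the gradient step $x^k - \nabla f(x^k)/L_k$ with thresholds $\lambda p s_i^k / L_k$. The sketch below illustrates the variant under the assumption $f(x) = \frac{1}{2}\|Ax - b\|_2^2$; the function names, parameter defaults, random data, and the cap `max_inner` on the backtracking loop are hypothetical additions for illustration.

```python
import numpy as np

def soft_threshold(v, w):
    """Componentwise minimizer of 0.5*(x - v)^2 + w*|x| for w >= 0."""
    return np.sign(v) * np.maximum(np.abs(v) - w, 0.0)

def F1(x, A, b, lam, p, eps):
    """Objective of (27) with alpha = 1 and f(x) = 0.5*||Ax - b||_2^2."""
    r = A @ x - b
    return 0.5 * (r @ r) + lam * np.sum((np.abs(x) + eps) ** p)

def irl1_variant(A, b, lam, p, eps, x0, L_min=1e-2, L_max=1e2,
                 tau=2.0, c=1e-4, iters=100, max_inner=60):
    x = np.asarray(x0, dtype=float).copy()
    values = [F1(x, A, b, lam, p, eps)]
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        s = (np.abs(x) + eps) ** (p - 1.0)        # weights s_i^k
        L = min(max(1.0, L_min), L_max)           # one choice of L_k^0 in [L_min, L_max]
        for _ in range(max_inner):                # cap guards against round-off stalls
            x_new = soft_threshold(x - grad / L, lam * p * s / L)   # solves (35)
            decrease = values[-1] - F1(x_new, A, b, lam, p, eps)
            if decrease >= 0.5 * c * np.sum((x_new - x) ** 2):      # test (36)
                break
            L *= tau                              # step 1c)
        x = x_new
        values.append(F1(x, A, b, lam, p, eps))
    return x, values

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12))
b = rng.standard_normal(8)
x_final, values = irl1_variant(A, b, lam=0.1, p=0.5, eps=1e-2, x0=np.zeros(12))
```

The acceptance test (36) forces a decrease of the objective at every outer iteration, so the recorded values should be non-increasing; only matrix-vector products with $A$ are needed, in contrast with the linear solves of an exact IRL$_2$ step.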

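We first show that for each outer iteration, the number of its inner iterations is finite.

Theorem 3.2 For each $k \ge 0$, the inner termination criterion (36) is satisfied after at most

$$\left\lceil \frac{\log(L_f + c) - \log(2 L_{\min})}{\log \tau} + 1 \right\rceil$$

inner iterations.

Proof. Let $\bar{L}_k$ denote the final value of $L_k$ at the $k$th outer iteration. Since the objective
function of (35) is strongly convex with modulus $L_k$, we have

$$f(x^k) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i^k|^\alpha \ge f(x^k) + \nabla f(x^k)^T (x^{k+1} - x^k) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i^{k+1}|^\alpha + L_k \|x^{k+1} - x^k\|_2^2.$$

Recall that $\nabla f$ is $L_f$-Lipschitz continuous. We then have

$$f(x^{k+1}) \le f(x^k) + \nabla f(x^k)^T (x^{k+1} - x^k) + \frac{L_f}{2}\|x^{k+1} - x^k\|_2^2. \tag{37}$$

Combining the above two inequalities, we obtain that

$$f(x^k) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i^k|^\alpha \ge f(x^{k+1}) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i^{k+1}|^\alpha + \left(L_k - \frac{L_f}{2}\right)\|x^{k+1} - x^k\|_2^2,$$

which together with (32) yields

$$G_{\alpha,\epsilon}(x^k, s^k) \ge G_{\alpha,\epsilon}(x^{k+1}, s^k) + \left(L_k - \frac{L_f}{2}\right)\|x^{k+1} - x^k\|_2^2.$$

Recall that $F_{\alpha,\epsilon}(x^k) = G_{\alpha,\epsilon}(x^k, s^k)$. In addition, it follows from (30) that
$F_{\alpha,\epsilon}(x) = \min_{s \ge 0} G_{\alpha,\epsilon}(x, s)$. Using these relations and the above inequality, we obtain that

$$F_{\alpha,\epsilon}(x^k) \ge F_{\alpha,\epsilon}(x^{k+1}) + \left(L_k - \frac{L_f}{2}\right)\|x^{k+1} - x^k\|_2^2.$$

Hence, (36) holds whenever $L_k \ge (L_f + c)/2$, which together with the definition of $\bar{L}_k$ implies
that $\bar{L}_k/\tau < (L_f + c)/2$, that is, $\bar{L}_k < \tau(L_f + c)/2$. Let $n_k$ denote the number of inner
iterations for the $k$th outer iteration. Then, we have

$$L_{\min} \tau^{n_k - 1} \le L_k^0 \tau^{n_k - 1} = \bar{L}_k < \tau(L_f + c)/2.$$

Hence,

$$n_k < \frac{\log(L_f + c) - \log(2 L_{\min})}{\log \tau} + 2,$$

and the conclusion holds.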



We next establish that any accumulation point of the sequence $\{x^k\}$ generated above is a
first-order stationary point of problem (27).

Theorem 3.3 Let the sequence $\{x^k\}$ be generated by the above variant of the IRL$_\alpha$ method.
Suppose that $x^*$ is an accumulation point of $\{x^k\}$. Then $x^*$ is a first-order stationary point of
(27).

Proof. It follows from (36) that $\{F_{\alpha,\epsilon}(x^k)\}$ is non-increasing. Since $x^*$ is an accumulation
point of $\{x^k\}$, there exists a subsequence $K$ such that $\{x^k\}_K \to x^*$. By the continuity of $F_{\alpha,\epsilon}$,
we have $\{F_{\alpha,\epsilon}(x^k)\}_K \to F_{\alpha,\epsilon}(x^*)$, which together with the monotonicity of $\{F_{\alpha,\epsilon}(x^k)\}$ implies
that $F_{\alpha,\epsilon}(x^k) \to F_{\alpha,\epsilon}(x^*)$. Using this result and (36), we can conclude that $\|x^{k+1} - x^k\| \to 0$.

Let $\bar{L}_k$ denote the final value of $L_k$ at the $k$th outer iteration. From the proof of Theorem
3.2, we know that $\bar{L}_k \in [L_{\min}, \tau(L_f + c)/2)$. The first-order optimality condition of (35) with
$L_k = \bar{L}_k$ yields

$$0 \in \frac{\partial f(x^k)}{\partial x_i} + \bar{L}_k (x_i^{k+1} - x_i^k) + \lambda p s_i^k |x_i^{k+1}|^{\alpha - 1}\, \mathrm{sgn}(x_i^{k+1}), \quad \forall i.$$

Upon taking limits on both sides of the above relation as $k \in K \to \infty$, we have

$$0 \in \frac{\partial f(x^*)}{\partial x_i} + \lambda p s_i^* |x_i^*|^{\alpha - 1}\, \mathrm{sgn}(x_i^*), \quad \forall i,$$

where $s_i^* = (|x_i^*|^\alpha + \epsilon_i)^{\frac{p}{\alpha} - 1}$ for all $i$. Hence, $x^*$ is a first-order stationary point of (27).



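Corollary 3.4 Let $\delta > 0$ be arbitrarily given, and let the sequence $\{x^k\}$ be generated by the
above IRL$_\alpha$ method or its variant. Suppose that $\{x^k\}$ has at least one accumulation point.
Then, there exists some $x^k$ such that

$$\|X^k \nabla f(x^k) + \lambda p |X^k|^\alpha (|x^k|^\alpha + \epsilon)^{\frac{p}{\alpha} - 1}\| \le \delta,$$

where $X^k = \mathrm{Diag}(x^k)$ and $|X^k|^\alpha = \mathrm{Diag}(|x^k|^\alpha)$.

Proof. Let $x^*$ be an arbitrary accumulation point of $\{x^k\}$. It follows from Theorem 3.1
(or, for the variant, from the proof of Theorem 3.3) that $x^*$ satisfies (34). Multiplying both
sides of (34) by $x_i^*$, we have

$$x_i^* \frac{\partial f(x^*)}{\partial x_i} + \lambda p (|x_i^*|^\alpha + \epsilon_i)^{\frac{p}{\alpha} - 1} |x_i^*|^\alpha = 0, \quad \forall i,$$

which, together with the continuity of $\nabla f(x)$ and $|x|^\alpha (|x|^\alpha + \epsilon)^{\frac{p}{\alpha} - 1}$, implies that the conclusion holds. ■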

We are now ready to present the first type of IRL$_\alpha$ methods and its variant for solving
problem (1) in which each subproblem is in the form of (27) and solved by the IRL$_\alpha$ method or its
variant described above. The IRL$_1$ and IRL$_2$ methods proposed in [25, 18] can be viewed as
special cases of the following general IRL$_\alpha$ method (but not its variant) with $f(x) = \frac{1}{2}\|Ax - b\|_2^2$
and $\alpha = 1$ or $2$.

The first type of IRL$_\alpha$ minimization methods and its variant for (1):

Let $\{\delta_k\}$ and $\{\epsilon^k\}$ be a sequence of positive scalars and vectors, respectively. Set $k = 0$.

1) Apply the IRL$_\alpha$ method or its variant to problem (27) with $\epsilon = \epsilon^k$ for finding $x^k$ satisfying

$$\|X^k \nabla f(x^k) + \lambda p |X^k|^\alpha (|x^k|^\alpha + \epsilon^k)^{\frac{p}{\alpha} - 1}\| \le \delta_k, \tag{38}$$

where $X^k = \mathrm{Diag}(x^k)$ and $|X^k|^\alpha = \mathrm{Diag}(|x^k|^\alpha)$.

2) Set $k \leftarrow k + 1$ and go to step 1).

end
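The outer loop above can be sketched for $\alpha = 2$ and $f(x) = \frac{1}{2}\|Ax - b\|_2^2$, using exact IRL$_2$ steps as the inner solver and stopping each inner run as soon as the residual in (38) drops below $\delta_k$. Everything specific here is an assumption made for illustration: the function names, the geometric schedules for $\epsilon^k$ and $\delta_k$, the scalar $\epsilon$, the warm start, the cap `max_inner`, and the synthetic data.

```python
import numpy as np

def residual_38(x, A, b, lam, p, eps):
    """Left-hand side of (38) for alpha = 2 and f(x) = 0.5*||Ax - b||_2^2."""
    grad = A.T @ (A @ x - b)
    return float(np.linalg.norm(
        x * grad + lam * p * x ** 2 * (x ** 2 + eps) ** (p / 2.0 - 1.0)))

def first_type_irl2(A, b, lam, p, x0, eps0=1.0, delta0=1e-2, shrink=0.5,
                    outer=6, max_inner=10000):
    AtA, Atb = A.T @ A, A.T @ b
    x = np.asarray(x0, dtype=float).copy()
    eps, delta, log = eps0, delta0, []
    for _ in range(outer):
        for _ in range(max_inner):            # inner IRL_2 run on (27) with eps = eps^k
            if residual_38(x, A, b, lam, p, eps) <= delta:
                break
            s = (x ** 2 + eps) ** (p / 2.0 - 1.0)
            x = np.linalg.solve(AtA + lam * p * np.diag(s), Atb)
        log.append((residual_38(x, A, b, lam, p, eps), delta))
        eps, delta = shrink * eps, shrink * delta   # eps^k -> 0 and delta_k -> 0
    return x, log

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 6))
x_true = np.array([1.0, 0.0, 0.0, -2.0, 0.0, 0.0])
b = A @ x_true + 0.01 * rng.standard_normal(30)
# x = 0 makes the left-hand side of (38) vanish, so a nonzero start is used.
x_hat, log = first_type_irl2(A, b, lam=0.5, p=0.5, x0=np.ones(6))
```

Warm-starting each inner run from the previous outer iterate keeps the inner iteration counts small once $\epsilon^k$ changes only slightly between outer steps.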



The convergence of the above IRL$_\alpha$ method and its variant is established as follows.

Theorem 3.5 Let $\{\delta_k\}$ and $\{\epsilon^k\}$ be a sequence of positive scalars and vectors such that
$\{\delta_k\} \to 0$ and $\{\epsilon^k\} \to 0$, respectively. Suppose that $\{x^k\}$ is a sequence of vectors generated
above satisfying (38), and that $x^*$ is an accumulation point of $\{x^k\}$. Then $x^*$ is a first-order
stationary point of (1), i.e., (6) holds at $x^*$.

Proof. Let $B = \{i : x_i^* \neq 0\}$. It follows from (38) that

$$\left| x_i^k \frac{\partial f(x^k)}{\partial x_i} + \lambda p |x_i^k|^\alpha (|x_i^k|^\alpha + \epsilon_i^k)^{\frac{p}{\alpha} - 1} \right| \le \delta_k, \quad \forall i \in B. \tag{39}$$

Since $x^*$ is an accumulation point of $\{x^k\}$, there exists a subsequence $K$ such that $\{x^k\}_K \to x^*$.
Upon taking limits on both sides of (39) as $k \in K \to \infty$, we obtain that

$$x_i^* \frac{\partial f(x^*)}{\partial x_i} + \lambda p |x_i^*|^p = 0, \quad \forall i \in B.$$

Since $x_i^* = 0$ for $i \notin B$, we observe that the above equality also holds for $i \notin B$. Hence, $x^*$
satisfies (6) and it is a first-order stationary point of (1). ■



3.2 The second type of IRL$_\alpha$ methods and its variant for (1)

In this subsection we are interested in the IRL$_1$ and IRL$_2$ methods proposed in [21, 20] for
solving problem (4). Given a sequence $\{\epsilon^k\} \subset \Re^n_+$ with $\epsilon^k \to 0$ as $k \to \infty$, these methods solve
a sequence of problems $\min_x Q_{1,\epsilon^k}(x)$ or $\min_x Q_{2,\epsilon^k}(x)$ extremely "roughly" by executing the IRL$_1$
or IRL$_2$ method only for one iteration for each $\epsilon^k$, where $Q_{\alpha,\epsilon}$ is defined in (26).

We next extend the above methods to solve (1) and also propose a variant of them in
which each subproblem has a closed-form solution. Moreover, we provide a unified convergence
analysis for them. We start by presenting the second type of IRL$_\alpha$ methods for solving (1) as
follows. They evidently become an IRL$_1$ or IRL$_2$ method when $\alpha = 1$ or $2$.

The second type of IRL$_\alpha$ minimization method for (1):

Let $\{\epsilon^k\}$ be a sequence of positive vectors in $\Re^n$. Choose an arbitrary $x^0$. Set $k = 0$.

1) Solve the weighted $l_\alpha$ minimization problem

$$x^{k+1} \in \mathrm{Arg}\min_{x} \left\{ f(x) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s_i^k |x_i|^\alpha \right\}, \tag{40}$$

where $s_i^k = (|x_i^k|^\alpha + \epsilon_i^k)^{\frac{p}{\alpha} - 1}$ for all $i$.

2) Set $k \leftarrow k + 1$ and go to step 1).

end
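A minimal sketch of this second-type method for $\alpha = 2$ and $f(x) = \frac{1}{2}\|Ax - b\|_2^2$ follows: one exact reweighted step (40) is taken per value of $\epsilon^k$. The geometric decrease of a scalar $\epsilon^k$, the function name, and the random data are assumptions for illustration; the methods cited in the text may update $\epsilon^k$ differently.

```python
import numpy as np

def second_type_irl2(A, b, lam, p, x0, eps0=1.0, shrink=0.7, iters=40):
    """One IRL_2 step (40) per eps^k, with eps^k = eps0 * shrink**k (assumed schedule)."""
    AtA, Atb = A.T @ A, A.T @ b
    x = np.asarray(x0, dtype=float).copy()
    eps, merit = eps0, []
    for _ in range(iters):
        r = A @ x - b
        # Merit value f(x^k) + lam * sum_i (|x_i^k|^2 + eps^k)^(p/2).
        merit.append(0.5 * (r @ r) + lam * np.sum((x ** 2 + eps) ** (p / 2.0)))
        s = (x ** 2 + eps) ** (p / 2.0 - 1.0)                  # weights s_i^k
        x = np.linalg.solve(AtA + lam * p * np.diag(s), Atb)   # exact solution of (40)
        eps *= shrink                                          # non-increasing eps^k
    return x, merit

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12))
b = rng.standard_normal(8)
x_final, merit = second_type_irl2(A, b, lam=0.1, p=0.5, x0=np.zeros(12))
```

With a non-increasing $\epsilon^k$ and exact subproblem solves, the recorded merit values should be non-increasing up to round-off, which gives a simple check on an implementation.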

We next establish that any accumulation point of {x'^} is a stationary point of ([1]). 

Theorem 3.6 Suppose that {e''} is a sequence of non-increasing positive vectors in 3ft" and 
e'^ — )■ as k ^ oo. Let the sequence {x^} be generated by the above IRLq, method. Suppose 
that X* is an accumulation point of {x^}. Then, x* is a stationary point of ([T]). 

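Proof. Let $G_{\alpha,\epsilon}(\cdot,\cdot)$ be defined in (32). By the definition of $x^{k+1}$, one can observe that $G_{\alpha,\epsilon^k}(x^{k+1}, s^k) \le G_{\alpha,\epsilon^k}(x^k, s^k)$. Also, by the definition of $s^{k+1}$ and a similar argument as in the proof of Theorem 3.1, we have $G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^{k+1}) = \min_{s \ge 0} G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s)$. Hence, $G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^{k+1}) \le G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^k)$. In addition, since $s^k > 0$ and $\{\epsilon^k\}$ is component-wise non-increasing, we observe that $G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^k) \le G_{\alpha,\epsilon^k}(x^{k+1}, s^k)$. Combining these inequalities, we have

$$G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^{k+1}) \le G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^k) \le G_{\alpha,\epsilon^k}(x^{k+1}, s^k) \le G_{\alpha,\epsilon^k}(x^k, s^k), \quad \forall k \ge 0. \tag{41}$$

Hence, $\{G_{\alpha,\epsilon^k}(x^k, s^k)\}$ is non-increasing. Since $x^*$ is an accumulation point of $\{x^k\}$, there exists a subsequence $K$ such that $\{x^k\}_K \to x^*$. By the definition of $s^k$, one can verify that $G_{\alpha,\epsilon^k}(x^k, s^k) = f(x^k) + \lambda \sum_{i=1}^n (|x^k_i|^{\alpha} + \epsilon^k_i)^{\frac{p}{\alpha}}$. It then follows from $\{x^k\}_K \to x^*$ and $\epsilon^k \to 0$ that $\{G_{\alpha,\epsilon^k}(x^k, s^k)\}_K \to f(x^*) + \lambda\|x^*\|_p^p$. This together with the monotonicity of $\{G_{\alpha,\epsilon^k}(x^k, s^k)\}$ implies that $G_{\alpha,\epsilon^k}(x^k, s^k) \to f(x^*) + \lambda\|x^*\|_p^p$. Using this relation and (41), we further have

$$G_{\alpha,\epsilon^k}(x^{k+1}, s^k) \to f(x^*) + \lambda\|x^*\|_p^p. \tag{42}$$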
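Let $B = \{i : x^*_i \neq 0\}$ and $\bar B$ be its complement in $\{1, \ldots, n\}$. We claim that

$$x^* \in \mathrm{Arg}\min_{x_{\bar B} = 0} \left\{ f(x) + \frac{\lambda p}{\alpha} \sum_{i \in B} |x_i|^{\alpha} |x^*_i|^{p-\alpha} \right\}. \tag{43}$$

Indeed, using the definition of $s^k$, we see that $\{s^k_i\}_K \to |x^*_i|^{p-\alpha}$, $\forall i \in B$. Further, due to $s^k > 0$ and $q < 0$, we observe that

$$0 \le \sum_{i \in \bar B} \left[ \epsilon^k_i s^k_i - \frac{(s^k_i)^q}{q} \right] \le \sum_{i \in \bar B} \left[ (|x^k_i|^{\alpha} + \epsilon^k_i) s^k_i - \frac{(s^k_i)^q}{q} \right] = \frac{\alpha}{p} \sum_{i \in \bar B} (|x^k_i|^{\alpha} + \epsilon^k_i)^{\frac{p}{\alpha}},$$

which, together with $\epsilon^k \to 0$ and $\{x^k_i\}_K \to 0$ for $i \in \bar B$, implies that

$$\lim_{k \in K \to \infty} \sum_{i \in \bar B} \left[ \epsilon^k_i s^k_i - \frac{(s^k_i)^q}{q} \right] = 0. \tag{44}$$

In addition, by the definition of $x^{k+1}$, we know that $G_{\alpha,\epsilon^k}(x, s^k) \ge G_{\alpha,\epsilon^k}(x^{k+1}, s^k)$. Then for every $x \in \Re^n$ such that $x_{\bar B} = 0$, we have

$$f(x) + \frac{\lambda p}{\alpha} \sum_{i \in B} \left[ (|x_i|^{\alpha} + \epsilon^k_i) s^k_i - \frac{(s^k_i)^q}{q} \right] + \frac{\lambda p}{\alpha} \sum_{i \in \bar B} \left[ \epsilon^k_i s^k_i - \frac{(s^k_i)^q}{q} \right] \ge G_{\alpha,\epsilon^k}(x^{k+1}, s^k).$$

Upon taking limits on both sides of this inequality as $k \in K \to \infty$, and using (42), (44) and the fact that $\{s^k_i\}_K \to |x^*_i|^{p-\alpha}$, $\forall i \in B$, we obtain that

$$f(x) + \frac{\lambda p}{\alpha} \sum_{i \in B} \left[ |x_i|^{\alpha} |x^*_i|^{p-\alpha} - \frac{|x^*_i|^{p}}{q} \right] \ge f(x^*) + \lambda\|x^*\|_p^p$$

for all $x \in \Re^n$ such that $x_{\bar B} = 0$. This inequality and (29) immediately yield (43). It then follows from the first-order optimality condition of (43) that

$$x^*_i \frac{\partial f(x^*)}{\partial x_i} + \lambda p\, |x^*_i|^{p} = 0 \quad \forall i \in B.$$

Since $x^*_i = 0$ for $i \in \bar B$, we observe that the above equality also holds for $i \in \bar B$. Hence, $x^*$ satisfies (6) and it is a stationary point of (1). ■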

Notice that the above IRL$_\alpha$ method requires solving a sequence of reweighted $l_\alpha$ minimization problems (40) whose solution may not be cheaply computable. We next propose a variant of this method in which each subproblem is much simpler and has a closed form solution for some $\alpha$'s (e.g., $\alpha = 1$ or $2$).

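A variant of the second type of IRL$_\alpha$ minimization method for (1):

Let $\{\epsilon^k\}$ be a sequence of positive vectors in $\Re^n$, and let $0 < L_{\min} < L_{\max}$, $\tau > 1$ and $c > 0$ be given. Choose an arbitrary $x^0$. Set $k = 0$.

1) Choose $L^0_k \in [L_{\min}, L_{\max}]$ arbitrarily. Set $L_k = L^0_k$.
   1a) Solve the weighted $l_\alpha$ minimization problem

$$x^{k+1} \in \mathrm{Arg}\min_x \left\{ f(x^k) + \nabla f(x^k)^T (x - x^k) + \frac{L_k}{2}\|x - x^k\|_2^2 + \frac{\lambda p}{\alpha} \sum_{i=1}^n s^k_i |x_i|^{\alpha} \right\}, \tag{45}$$

   where $s^k_i = (|x^k_i|^{\alpha} + \epsilon^k_i)^{\frac{p}{\alpha}-1}$ for all $i$.
   1b) If

$$F_{\alpha,\epsilon^k}(x^k) - F_{\alpha,\epsilon^{k+1}}(x^{k+1}) \ge \frac{c}{2}\|x^{k+1} - x^k\|_2^2 \tag{46}$$

   is satisfied, then go to step 2).
   1c) Set $L_k \leftarrow \tau L_k$ and go to step 1a).
2) Set $k \leftarrow k + 1$ and go to step 1).
end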
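For $\alpha = 1$ the subproblem in step 1a) reduces to componentwise soft-thresholding, which is what makes this variant cheap. The following Python sketch illustrates steps 1) to 2) under stated assumptions: a scalar schedule `eps_seq` (that is, $\epsilon^k$ equal to a scalar times the all-ones vector), a fixed initial value $L^0_k$ = `L0`, and $F_{1,\epsilon}(x) = f(x) + \lambda \sum_i (|x_i| + \epsilon)^p$. The function names and the small least-squares example are hypothetical and not taken from the paper.

```python
def soft_threshold(v, t):
    """Minimizer of 0.5*(x - v)**2 + t*|x| over real x, for t >= 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def irl1_variant(f, grad_f, x0, lam, p, eps_seq, L0=1.0, tau=1.1, c=1e-4):
    """Sketch of the variant of the second-type IRL_1 method (alpha = 1)."""
    def F(x, eps):  # F_{1,eps}(x) = f(x) + lam * sum_i (|x_i| + eps)^p
        return f(x) + lam * sum((abs(xi) + eps) ** p for xi in x)

    x = list(x0)
    for eps, eps_next in zip(eps_seq, eps_seq[1:]):
        s = [(abs(xi) + eps) ** (p - 1.0) for xi in x]  # weights s_i^k
        g = grad_f(x)
        L = L0  # step 1): initial L_k^0
        while True:
            # step 1a): closed-form solution of (45) for alpha = 1
            x_new = [soft_threshold(xi - gi / L, lam * p * si / L)
                     for xi, gi, si in zip(x, g, s)]
            step_sq = sum((a - b) ** 2 for a, b in zip(x_new, x))
            # step 1b): sufficient decrease test (46)
            if F(x, eps) - F(x_new, eps_next) >= 0.5 * c * step_sq:
                break
            L *= tau  # step 1c)
        x = x_new
    return x


# Hypothetical toy data: f(x) = 0.5*||A x - b||^2 with a 2 x 3 matrix A.
A = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
b = [1.0, -1.0]

def residual(x):
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]

def f(x):
    return 0.5 * sum(r * r for r in residual(x))

def grad_f(x):
    r = residual(x)
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]

x = irl1_variant(f, grad_f, [0.0, 0.0, 0.0], lam=0.01, p=0.5,
                 eps_seq=[0.5 ** k for k in range(60)])
```

Because a trial point is accepted only when (46) holds, the values $F_{1,\epsilon^k}(x^k)$ in this sketch cannot increase from one outer iteration to the next.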

We first show that for each outer iteration, the number of its inner iterations is finite.

Theorem 3.7 For each $k \ge 0$, the inner termination criterion (46) is satisfied after at most

$$\left\lfloor \frac{\log(L_f + c) - \log(2 L_{\min})}{\log \tau} + 2 \right\rfloor$$

inner iterations.

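Proof. Let $G_{\alpha,\epsilon}(\cdot,\cdot)$ be defined in (32). Since the objective function of (45) is strongly convex with modulus $L_k$, we have

$$f(x^k) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s^k_i |x^k_i|^{\alpha} \ge f(x^k) + \nabla f(x^k)^T (x^{k+1} - x^k) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s^k_i |x^{k+1}_i|^{\alpha} + L_k \|x^{k+1} - x^k\|_2^2$$

$$\ge f(x^{k+1}) + \frac{\lambda p}{\alpha} \sum_{i=1}^n s^k_i |x^{k+1}_i|^{\alpha} + \left(L_k - \frac{L_f}{2}\right) \|x^{k+1} - x^k\|_2^2,$$

where the last inequality is due to (37). This inequality together with the definition of $G_{\alpha,\epsilon}$ implies that

$$G_{\alpha,\epsilon^k}(x^k, s^k) \ge G_{\alpha,\epsilon^k}(x^{k+1}, s^k) + \left(L_k - \frac{L_f}{2}\right) \|x^{k+1} - x^k\|_2^2.$$

In addition, by the same arguments as in the proof of Theorem 3.6, we have $G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^{k+1}) \le G_{\alpha,\epsilon^k}(x^{k+1}, s^k)$. By the definitions of $F_{\alpha,\epsilon}$ and $s^k$, one can easily verify that $G_{\alpha,\epsilon^k}(x^k, s^k) = F_{\alpha,\epsilon^k}(x^k)$ for all $k$. Combining these relations with the above inequality, we obtain that

$$F_{\alpha,\epsilon^{k+1}}(x^{k+1}) = G_{\alpha,\epsilon^{k+1}}(x^{k+1}, s^{k+1}) \le G_{\alpha,\epsilon^k}(x^{k+1}, s^k) \le G_{\alpha,\epsilon^k}(x^k, s^k) - \left(L_k - \frac{L_f}{2}\right) \|x^{k+1} - x^k\|_2^2.$$

Hence, (46) holds whenever $L_k \ge (L_f + c)/2$. The rest of the proof is similar to that of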
Theorem 14.21 ■ 



We next show that any accumulation point of the sequence $\{x^k\}$ generated above is a first-order stationary point of problem (1).

Theorem 3.8 Suppose that $\{\epsilon^k\}$ is a sequence of non-increasing positive vectors in $\Re^n$ and $\epsilon^k \to 0$ as $k \to \infty$. Let the sequence $\{x^k\}$ be generated by the above IRL$_\alpha$ method. Suppose that $x^*$ is an accumulation point of $\{x^k\}$. Then, $x^*$ is a stationary point of (1), i.e., (6) holds at $x^*$.
Proof. Since $F_{\alpha,\epsilon}(x) \ge F(x) \ge \underline{f}$ for every $x \in \Re^n$, we see that $\{F_{\alpha,\epsilon^k}(x^k)\}$ is bounded below. In addition, $\{F_{\alpha,\epsilon^k}(x^k)\}$ is non-increasing due to (46). Hence, $\{F_{\alpha,\epsilon^k}(x^k)\}$ converges, which together with (46) implies that $\|x^{k+1} - x^k\| \to 0$. Let $\bar L_k$ denote the final value of $L_k$ at the $k$th outer iteration. By a similar argument as in the proof of Theorem 3.3, we can show that $\bar L_k \in [L_{\min}, \tau(L_f + c)/2)$. Let $B = \{i \mid x^*_i \neq 0\}$. Since $x^*$ is an accumulation point of $\{x^k\}$, there exists a subsequence $K$ such that $\{x^k\}_K \to x^*$. By the definition of $s^k$, we see that $\lim_{k \in K \to \infty} s^k_i = |x^*_i|^{p-\alpha}$, $\forall i \in B$. The first-order optimality condition of (45) with $L_k = \bar L_k$ yields

$$\frac{\partial f(x^k)}{\partial x_i} + \bar L_k (x^{k+1}_i - x^k_i) + \lambda p\, s^k_i |x^{k+1}_i|^{\alpha-1} \mathrm{sgn}(x^{k+1}_i) = 0, \quad \forall i \in B.$$

Upon taking limits on both sides of the above equality as $k \in K \to \infty$, and using the relation $\lim_{k \in K \to \infty} s^k_i = |x^*_i|^{p-\alpha}$, we have

$$\frac{\partial f(x^*)}{\partial x_i} + \lambda p\, |x^*_i|^{p-1} \mathrm{sgn}(x^*_i) = 0, \quad \forall i \in B.$$

Using this relation and a similar argument as in the proof of Theorem 2.1 (i), we can conclude that $x^*$ satisfies (6). ■



4 New iterative reweighted $l_1$ minimization for (1)

The IRL$_1$ and IRL$_2$ methods studied in Section 3 require that the parameter $\epsilon$ be dynamically adjusted and approach zero. One natural question is whether an iterative reweighted minimization method can be proposed for (1) that shares a similar convergence with those methods but does not need to adjust $\epsilon$. We will address this question by proposing a new IRL$_1$ method and its variant.

As shown in Subsection 2.2, problem (20) has a locally Lipschitz continuous objective function and it is an $\epsilon$-approximation to (1). Moreover, when $\epsilon$ is below a computable threshold value, a certain stationary point of (20) is also that of (1). In this section we propose new IRL$_1$ methods for solving (1), which can be viewed as the IRL$_1$ methods directly applied to problem (20). The novelty of these methods is that the parameter $\epsilon$ is chosen only once and then fixed throughout all iterations. Remarkably, we are able to establish that any accumulation point of the sequence generated by these methods is a first-order stationary point of (1).

New IRL$_1$ minimization method for (1):

Let $q$ be defined in (13). Choose an arbitrary $x^0 \in \Re^n$ and $\epsilon$ such that (21) holds. Set $k = 0$.
1) Solve the weighted $l_1$ minimization problem

$$x^{k+1} \in \mathrm{Arg}\min_x \left\{ f(x) + \lambda p \sum_{i=1}^n s^k_i |x_i| \right\}, \tag{47}$$

where $s^k_i = \min\left\{ \left(\frac{\epsilon}{\lambda n}\right)^{\frac{1}{q}}, |x^k_i|^{\frac{1}{q-1}} \right\}$ for all $i$.
2) Set $k \leftarrow k + 1$ and go to step 1).
end
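The capped weights are the only new ingredient of this method, so a short Python sketch may help. It rests on assumptions that should be read loudly: $q = p/(p-1)$ (inferred from the identity $|x_i|^{1/(q-1)} = |x_i|^{p-1}$ used later in the proof of Theorem 4.1, since (13) is not reproduced here), a value of $\epsilon$ that is simply taken to satisfy (21) without checking it, and the illustrative separable function $f(x) = \frac{1}{2}\|x - c\|_2^2$, for which (47) is solved by soft-thresholding. The names `new_irl1_weights` and `new_irl1_separable` are hypothetical.

```python
def new_irl1_weights(x, lam, p, eps):
    """Weights s_i = min{u_eps, |x_i|^(1/(q-1))} with u_eps = (eps/(lam*n))^(1/q)."""
    n = len(x)
    q = p / (p - 1.0)  # assumed conjugate exponent (q < 0 for 0 < p < 1)
    u = (eps / (lam * n)) ** (1.0 / q)
    s = []
    for xi in x:
        if xi == 0.0:
            s.append(u)  # |x_i|^(1/(q-1)) is +infinity at zero, so the cap u applies
        else:
            s.append(min(u, abs(xi) ** (1.0 / (q - 1.0))))
    return s


def new_irl1_separable(c, lam, p, eps, num_iters=50):
    """New IRL_1 iteration for the illustrative f(x) = 0.5*||x - c||^2, eps fixed."""
    x = list(c)  # arbitrary starting point x^0
    for _ in range(num_iters):
        s = new_irl1_weights(x, lam, p, eps)
        # (47) decouples: x_i minimizes 0.5*(x_i - c_i)^2 + lam*p*s_i*|x_i|
        x = [max(abs(ci) - lam * p * si, 0.0) * (1.0 if ci > 0 else -1.0)
             for ci, si in zip(c, s)]
    return x


x = new_irl1_separable([2.0, 0.01], lam=0.1, p=0.5, eps=0.02)
```

Unlike the methods of Section 3, `eps` is never updated inside the loop, which mirrors the point of this section.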



We next establish that any accumulation point of $\{x^k\}$ generated by the above method is a first-order stationary point of (1).

Theorem 4.1 Let the sequence $\{x^k\}$ be generated by the above IRL$_1$ method. Assume that $\epsilon$ satisfies (21). Suppose that $x^*$ is an accumulation point of $\{x^k\}$. Then $x^*$ is a first-order stationary point of (1), i.e., (6) holds at $x^*$. Moreover, the nonzero entries of $x^*$ satisfy the first-order bound (10).



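Proof. Let $u_\epsilon = \left(\frac{\epsilon}{\lambda n}\right)^{1/q}$ and

$$G(x, s) = f(x) + \lambda p \sum_{i=1}^n \left[ |x_i| s_i - \frac{s_i^q}{q} \right]. \tag{48}$$

By the definition of $\{s^k\}$, one can observe that for $k \ge 0$,

$$s^k = \arg\min_{0 \le s \le u_\epsilon} G(x^k, s), \qquad x^{k+1} \in \mathrm{Arg}\min_x G(x, s^k). \tag{49}$$

In addition, we observe that $F_\epsilon(x) = \min_{0 \le s \le u_\epsilon} G(x, s)$ and $F_\epsilon(x^k) = G(x^k, s^k)$ for all $k$, where $F_\epsilon$ is defined in (15). It then follows that

$$F_\epsilon(x^{k+1}) = G(x^{k+1}, s^{k+1}) \le G(x^{k+1}, s^k) \le G(x^k, s^k) = F_\epsilon(x^k). \tag{50}$$

Hence, $\{F_\epsilon(x^k)\}$ is non-increasing. Since $x^*$ is an accumulation point of $\{x^k\}$, there exists a subsequence $K$ such that $\{x^k\}_K \to x^*$. By the continuity of $F_\epsilon$, we have $\{F_\epsilon(x^k)\}_K \to F_\epsilon(x^*)$, which together with the monotonicity of $\{F_\epsilon(x^k)\}$ implies that $F_\epsilon(x^k) \to F_\epsilon(x^*)$. Let $s^*_i = \min\{u_\epsilon, |x^*_i|^{\frac{1}{q-1}}\}$ for all $i$. We then observe that $\{s^k\}_K \to s^*$ and $F_\epsilon(x^*) = G(x^*, s^*)$. Using (50) and $F_\epsilon(x^k) \to F_\epsilon(x^*)$, we see that $G(x^{k+1}, s^k) \to F_\epsilon(x^*) = G(x^*, s^*)$. In addition, it follows from (49) that

$$G(x, s^k) \ge G(x^{k+1}, s^k) \quad \forall x \in \Re^n.$$

Upon taking limits on both sides of this inequality as $k \in K \to \infty$, we have

$$G(x, s^*) \ge G(x^*, s^*) \quad \forall x \in \Re^n,$$

that is,

$$x^* \in \mathrm{Arg}\min_x \left\{ f(x) + \lambda p \sum_{i=1}^n s^*_i |x_i| \right\}. \tag{51}$$

The first-order optimality condition of (51) yields

$$0 \in \frac{\partial f(x^*)}{\partial x_i} + \lambda p\, s^*_i\, \mathrm{sgn}(x^*_i), \quad \forall i. \tag{52}$$

Recall that $s^*_i = \min\{u_\epsilon, |x^*_i|^{\frac{1}{q-1}}\}$, which together with (13) implies that for all $i$,

$$s^*_i = \begin{cases} |x^*_i|^{p-1}, & \text{if } |x^*_i| > u_\epsilon^{q-1}, \\ u_\epsilon, & \text{if } |x^*_i| \le u_\epsilon^{q-1}. \end{cases}$$

Substituting it into (52) and using (15), we obtain that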

OXi 

It then follows from (18) that $x^*$ is a first-order stationary point of $F_\epsilon$. In addition, by the monotonicity of $\{F_\epsilon(x^k)\}$ and $F_\epsilon(x^k) \to F_\epsilon(x^*)$, we know that $F_\epsilon(x^*) \le F_\epsilon(x^0)$. Using these
results and Theorem 12 .Tt we conclude that x* is a first-order stationary point of ([1]). The rest 
of the conclusion immediately follows from Theorem 2.3. ■

The above IRL$_1$ method needs to solve a sequence of reweighted $l_1$ minimization problems (47) whose solution may not be cheaply computable. We next propose a variant of this method in which each subproblem has a closed form solution.

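A variant of new IRL$_1$ minimization method for (1):

Let $0 < L_{\min} < L_{\max}$, $\tau > 1$ and $c > 0$ be given. Let $q$ be defined in (13). Choose an arbitrary $x^0$ and $\epsilon$ such that (21) holds. Set $k = 0$.

1) Choose $L^0_k \in [L_{\min}, L_{\max}]$ arbitrarily. Set $L_k = L^0_k$.
   1a) Solve the weighted $l_1$ minimization problem

$$x^{k+1} \in \mathrm{Arg}\min_x \left\{ f(x^k) + \nabla f(x^k)^T (x - x^k) + \frac{L_k}{2}\|x - x^k\|_2^2 + \lambda p \sum_{i=1}^n s^k_i |x_i| \right\}, \tag{53}$$

   where $s^k_i = \min\left\{ \left(\frac{\epsilon}{\lambda n}\right)^{\frac{1}{q}}, |x^k_i|^{\frac{1}{q-1}} \right\}$ for all $i$.
   1b) If

$$F_\epsilon(x^k) - F_\epsilon(x^{k+1}) \ge \frac{c}{2}\|x^{k+1} - x^k\|_2^2 \tag{54}$$

   is satisfied, where $F_\epsilon$ is defined in (15), then go to step 2).
   1c) Set $L_k \leftarrow \tau L_k$ and go to step 1a).
2) Set $k \leftarrow k + 1$ and go to step 1).
end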
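A compact Python sketch of this variant follows. It is a sketch under stated assumptions rather than the authors' implementation: $q = p/(p-1)$ is assumed, $F_\epsilon$ is evaluated as $G(x, s)$ at the capped weights $s$ (using the relation $F_\epsilon(x) = \min_{0 \le s \le u_\epsilon} G(x, s)$ from the proof of Theorem 4.1), $\epsilon$ is taken to satisfy (21) without verification, and $L^0_k$ is held at a fixed value `L0`. All function names and the toy data are hypothetical.

```python
def soft_threshold(v, t):
    """Minimizer of 0.5*(x - v)**2 + t*|x| over real x, for t >= 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def capped_weights(x, lam, p, eps):
    q = p / (p - 1.0)  # assumed conjugate exponent, so 1/(q - 1) = p - 1
    u = (eps / (lam * len(x))) ** (1.0 / q)
    s = [u if xi == 0.0 else min(u, abs(xi) ** (p - 1.0)) for xi in x]
    return s, q


def F_eps(f, x, lam, p, eps):
    """F_eps(x) = f(x) + lam*p*sum_i (|x_i|*s_i - s_i^q/q) at the capped weights."""
    s, q = capped_weights(x, lam, p, eps)
    return f(x) + lam * p * sum(abs(xi) * si - si ** q / q for xi, si in zip(x, s))


def new_irl1_variant(f, grad_f, x0, lam, p, eps, num_outer=100,
                     L0=1.0, tau=1.1, c=1e-4):
    x = list(x0)
    for _ in range(num_outer):
        s, _ = capped_weights(x, lam, p, eps)
        g = grad_f(x)
        L = L0  # step 1)
        while True:
            # step 1a): closed-form solution of (53)
            x_new = [soft_threshold(xi - gi / L, lam * p * si / L)
                     for xi, gi, si in zip(x, g, s)]
            step_sq = sum((a - b) ** 2 for a, b in zip(x_new, x))
            # step 1b): sufficient decrease test (54), eps stays fixed
            if F_eps(f, x, lam, p, eps) - F_eps(f, x_new, lam, p, eps) >= 0.5 * c * step_sq:
                break
            L *= tau  # step 1c)
        x = x_new
    return x


# Hypothetical toy data: f(x) = 0.5*||A x - b||^2 with a 2 x 3 matrix A.
A = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
b = [1.0, -1.0]

def residual(x):
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]

def f(x):
    return 0.5 * sum(r * r for r in residual(x))

def grad_f(x):
    r = residual(x)
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]

x = new_irl1_variant(f, grad_f, [0.0, 0.0, 0.0], lam=0.01, p=0.5, eps=0.003)
```

For entries with $|x_i| > u_\epsilon^{q-1}$ the summand in `F_eps` equals $|x_i|^p/p$, so `F_eps` agrees with $F$ on such entries and differs only near zero.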

We first show that for each outer iteration, the number of its inner iterations is finite.

Theorem 4.2 For each $k \ge 0$, the inner termination criterion (54) is satisfied after at most

$$\left\lfloor \frac{\log(L_f + c) - \log(2 L_{\min})}{\log \tau} + 2 \right\rfloor$$

inner iterations.
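To get a feel for the size of this bound, it can be evaluated numerically. The Python sketch below assumes that the brackets lost in extraction denote rounding down, and the value `L_f = 4.0` is purely hypothetical; `c`, `L_min` and `tau` are the values reported in Section 5.

```python
import math


def max_inner_iterations(L_f, c, L_min, tau):
    """Evaluate floor((log(L_f + c) - log(2*L_min)) / log(tau) + 2)."""
    ratio = (math.log(L_f + c) - math.log(2.0 * L_min)) / math.log(tau)
    return math.floor(ratio + 2.0)


bound = max_inner_iterations(L_f=4.0, c=1e-4, L_min=1e-8, tau=1.1)
```

The bound grows only logarithmically in $L_f / L_{\min}$, so even a very small $L_{\min}$ keeps the worst-case number of inner iterations moderate.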
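Proof. Let $\bar L_k$ denote the final value of $L_k$ at the $k$th outer iteration. Since the objective function of (53) is strongly convex with modulus $L_k$, we have

$$f(x^k) + \lambda p \sum_{i=1}^n s^k_i |x^k_i| \ge f(x^k) + \nabla f(x^k)^T (x^{k+1} - x^k) + \lambda p \sum_{i=1}^n s^k_i |x^{k+1}_i| + L_k \|x^{k+1} - x^k\|_2^2$$

$$\ge f(x^{k+1}) + \lambda p \sum_{i=1}^n s^k_i |x^{k+1}_i| + \left(L_k - \frac{L_f}{2}\right) \|x^{k+1} - x^k\|_2^2,$$

where the last inequality is due to (37). This inequality together with (48) yields

$$G(x^k, s^k) \ge G(x^{k+1}, s^k) + \left(L_k - \frac{L_f}{2}\right) \|x^{k+1} - x^k\|_2^2.$$

Recall that $F_\epsilon(x) = \min_{0 \le s \le u_\epsilon} G(x, s)$ and $F_\epsilon(x^k) = G(x^k, s^k)$, where $u_\epsilon = \left(\frac{\epsilon}{\lambda n}\right)^{1/q}$. Using these relations and the above inequality, we obtain that

$$F_\epsilon(x^{k+1}) = G(x^{k+1}, s^{k+1}) \le G(x^{k+1}, s^k) \le F_\epsilon(x^k) - \left(L_k - \frac{L_f}{2}\right) \|x^{k+1} - x^k\|_2^2.$$

Hence, (54) holds whenever $L_k \ge (L_f + c)/2$. The rest of the proof is similar to that of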
Theorem 14.21 ■ 

We next establish that any accumulation point of the sequence $\{x^k\}$ generated above is a first-order stationary point of problem (1).

Theorem 4.3 Let the sequence $\{x^k\}$ be generated by the above variant of new IRL$_1$ method. Assume that $\epsilon$ satisfies (21). Suppose that $x^*$ is an accumulation point of $\{x^k\}$. Then $x^*$ is a first-order stationary point of (1), i.e., (6) holds at $x^*$. Moreover, the nonzero entries of $x^*$ satisfy the first-order bound (10).



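Proof. It follows from (54) that $\{F_\epsilon(x^k)\}$ is non-increasing. Since $x^*$ is an accumulation point of $\{x^k\}$, there exists a subsequence $K$ such that $\{x^k\}_K \to x^*$. By the continuity of $F_\epsilon$, we have $\{F_\epsilon(x^k)\}_K \to F_\epsilon(x^*)$, which together with the monotonicity of $\{F_\epsilon(x^k)\}$ implies that $F_\epsilon(x^k) \to F_\epsilon(x^*)$. Using this result and (54), we can conclude that $\|x^{k+1} - x^k\| \to 0$. Let $\bar L_k$ denote the final value of $L_k$ at the $k$th outer iteration. By a similar argument as in the proof of Theorem 3.3, one can show that $\bar L_k \in [L_{\min}, \tau(L_f + c)/2)$. The first-order optimality condition of (53) with $L_k = \bar L_k$ yields

$$0 \in \frac{\partial f(x^k)}{\partial x_i} + \bar L_k (x^{k+1}_i - x^k_i) + \lambda p\, s^k_i\, \mathrm{sgn}(x^{k+1}_i), \quad \forall i.$$

Upon taking limits on both sides of the above relation as $k \in K \to \infty$, we have

$$0 \in \frac{\partial f(x^*)}{\partial x_i} + \lambda p\, s^*_i\, \mathrm{sgn}(x^*_i), \quad \forall i,$$

where $s^*_i = \min\left\{ \left(\frac{\epsilon}{\lambda n}\right)^{\frac{1}{q}}, |x^*_i|^{\frac{1}{q-1}} \right\}$ for all $i$. The rest of the proof is similar to that of Theorem 4.1. ■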
5 Computational results

In this section we conduct numerical experiments to compare the performance of the variants of IRL$_1$ methods proposed in Subsections 3.1 and 3.2 and Section 4. In particular, we apply these methods to problem (5) whose data are randomly generated. For convenience of presentation, we name these variants IRL$_1$-1, IRL$_1$-2 and IRL$_1$-3, respectively. All codes are written in MATLAB and all computations are performed on a MacBook Pro running Mac OS X Lion 10.7.4 with 4GB memory.

For all three methods, we choose $L_{\min}$ = 1e-8, $L_{\max}$ = 1e+8, $c$ = 1e-4, $\tau$ = 1.1, and $L^0_0 = 1$. We update $L^0_k$ by a strategy similar to the one used in the spectral projected gradient method [1], that is,

$$L^0_k = \max\left\{ L_{\min}, \min\left\{ L_{\max}, \frac{\Delta x^T \Delta g}{\|\Delta x\|^2} \right\} \right\},$$

where $\Delta x = x^k - x^{k-1}$ and $\Delta g = \nabla f(x^k) - \nabla f(x^{k-1})$. In addition, we choose $\epsilon^k = 0.1^k e$ and $\delta_k = 0.1^k$ for IRL$_1$-1 and $\epsilon^k = 0.995^k e$ for IRL$_1$-2, respectively, where $e$ is the all-ones
vector. For IRLi-3, e is chosen to be the one satisfying fl2T]) but within 10~^ to the supremum 
of all $\epsilon$'s satisfying (21). The same initial point is used for IRL$_1$-1, IRL$_1$-2 and IRL$_1$-3. In particular, we choose $x^0$ to be

$$x^0 \in \mathrm{Arg}\min_x \left\{ \frac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1 \right\},$$

which can be computed by a variety of methods (e.g., [291 IB ISSl ISOj [31]). And all methods 
terminate according to the following criterion 

$$\|X \nabla f(x) + \lambda p |x|^p\|_\infty \le \text{1e-4},$$

where $X = \mathrm{Diag}(x)$.
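The two implementation details described above, the safeguarded initial value $L^0_k$ and the stopping test, can be sketched in Python as follows. The sketch assumes that the garbled update is the usual safeguarded Barzilai-Borwein ratio $\Delta x^T \Delta g / \|\Delta x\|^2$; the fallback for $\Delta x = 0$ and the function names are assumptions introduced only for illustration.

```python
def initial_L(x, x_prev, g, g_prev, L_min=1e-8, L_max=1e8):
    """Safeguarded ratio dx^T dg / ||dx||^2, clipped to [L_min, L_max]."""
    dx = [a - b for a, b in zip(x, x_prev)]
    dg = [a - b for a, b in zip(g, g_prev)]
    dx_sq = sum(d * d for d in dx)
    if dx_sq == 0.0:
        return L_min  # assumed fallback when the iterate did not move
    ratio = sum(a * b for a, b in zip(dx, dg)) / dx_sq
    return max(L_min, min(L_max, ratio))


def is_stationary(x, grad, lam, p, tol=1e-4):
    """Test ||X grad f(x) + lam*p*|x|^p||_inf <= tol with X = Diag(x)."""
    return max(abs(xi * gi + lam * p * abs(xi) ** p)
               for xi, gi in zip(x, grad)) <= tol
```

Note that the stopping test is automatically satisfied on zero entries of `x`, in line with the form of condition (6).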

In the first experiment, we set $\lambda$ = 3e-3 for problem (5). The data $A$ and $b$ are randomly generated in the same manner as described in $l_1$-magic. In particular, given $\sigma > 0$ and positive integers $m$, $n$, $T$ with $m < n$ and $T < n$, we first generate a matrix $W \in \Re^{n \times m}$ with entries randomly chosen from a normal distribution with mean zero, variance one and standard deviation one. Then we compute an orthonormal basis, denoted by $B$, for the range space of $W$, and set $A = B^T$. We also randomly generate a vector $x \in \Re^n$ with only $T$ nonzero components that are $\pm 1$, and generate a vector $v \in \Re^m$ with entries randomly chosen from a normal distribution with mean zero, variance one and standard deviation one. Finally, we set $b = Ax + \sigma v$. In particular, we choose $\sigma = 0.005$ for all instances.
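The data generation just described can be sketched in pure Python as follows. This is not the authors' MATLAB code: modified Gram-Schmidt stands in for whatever orthonormalization routine they used, $A = B^T$ is assumed (the original formula is lost in extraction), and the function name, the `seed` argument and the tiny problem sizes in the usage line are hypothetical.

```python
import random


def generate_instance(m, n, T, sigma=0.005, seed=0):
    """Return (A, b, x_true) with A an m x n matrix having orthonormal rows."""
    rng = random.Random(seed)
    # Columns of W (n x m), stored as m vectors of length n with N(0, 1) entries.
    cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    # Modified Gram-Schmidt: the orthonormalized columns of W become the rows of A.
    A = []
    for v in cols:
        for u in A:
            proj = sum(ui * vi for ui, vi in zip(u, v))
            v = [vi - proj * ui for ui, vi in zip(u, v)]
        norm = sum(vi * vi for vi in v) ** 0.5
        A.append([vi / norm for vi in v])
    # Sparse vector with exactly T nonzero entries equal to +1 or -1.
    x_true = [0.0] * n
    for i in rng.sample(range(n), T):
        x_true[i] = rng.choice([-1.0, 1.0])
    # Noisy observations b = A x + sigma * v with standard normal v.
    noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
    b = [sum(a * xj for a, xj in zip(row, x_true)) + sigma * vi
         for row, vi in zip(A, noise)]
    return A, b, x_true


A, b, x_true = generate_instance(m=6, n=20, T=3)
```

Since the rows of `A` are orthonormal, $A A^T = I$ and the gradient of the least-squares term has a Lipschitz constant of one, which may be relevant when choosing $L^0_k$.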

The results of these methods for the above randomly generated instances with $p = 0.1$ and $0.5$ are presented in Tables 1 and 2, respectively. In detail, the parameters $m$ and $n$ of each instance are listed in the first two columns, respectively. The objective function value of problem (5) for these methods is given in columns three to five, and CPU times (in seconds) are given in the last three columns, respectively. We shall mention that the CPU time reported here does not include the time for obtaining the initial point $x^0$. For $p = 0.1$, we observe from
Table 1: Comparison of three IRL$_1$ methods for problem (5) with p = 0.1

   Problem               Objective Value                      CPU Time
    m      n      IRL1-1      IRL1-2      IRL1-3      IRL1-1   IRL1-2   IRL1-3
   120    512    0.061371    0.061371    0.061371      0.02     0.29     0.01
   240   1024    0.122579    0.122579    0.122579      0.01     0.47     0.01
   360   1536    u. ioooyo   u. ioooyo   u. ioooyo     0.01     0.77     0.01
   480   2048    0.245253    0.245253    0.245253      n no     1.40     n no
   600   2560    0.305575    0.305575    0.305575      0.03     2.30     0.03
   720   3072    0.367497    0.367496    0.347697      0.04     3.11     0.04
   840   3584    0.429549    0.429548    0.429549      0.05     3.83     0.06
   960   4096    0.489512    0.489512    0.489512      0.06     5.32     0.08
  1080   4608    0.550911    0.550911    0.554911      0.07     6.59     0.10
  1200   5120    0.611896    0.611896    0.611896      0.10     7.51     0.13


Table 2: Comparison of three IRL$_1$ methods for problem (5) with p = 0.5

   Problem               Objective Value                      CPU Time
    m      n      IRL1-1      IRL1-2      IRL1-3      IRL1-1   IRL1-2   IRL1-3
   120    512    0.061298    0.062003    0.061298      0.02     0.17     0.01
   240   1024    0.122412    0.123449    0.122412      0.01     0.26     0.01
   360   1536    0.183376    0.184881    0.183376      0.01     0.43     0.01
   480   2048    0.244745    0.247495    0.244745      0.02     0.90     0.02
   600   2560    0.304945    0.306632    0.304945      0.03     1.55     0.03
   720   3072    0.366621    0.370576    0.366621      0.03     2.07     0.04
   840   3584    0.429043    0.433426    0.429043      0.04     2.57     0.06
   960   4096    0.488704    0.492537    0.488704      0.05     3.54     0.08
  1080   4608    0.550031    0.554057    0.550031      0.06     4.40     0.10
  1200   5120    0.610850    0.615399    0.610850      0.07     5.26     0.12


Table 1 that all three methods produce similar objective function values. The CPU times of IRL$_1$-1 and IRL$_1$-3 are very close, and both are much less than that of IRL$_1$-2. For $p = 0.5$, we see from Table 2 that IRL$_1$-1 and IRL$_1$-3 achieve better objective function values than IRL$_1$-2 while the former two methods require less CPU time.

In the second experiment, we also randomly generate all instances for problem (5). In particular, we generate matrix $A$ and vector $b$ with entries randomly chosen from the standard uniform distribution. In addition, we set $\lambda$ = 3e-3 for (5). The results of these methods for the above randomly generated instances with $p = 0.1$ and $0.5$ are presented in Tables 3 and 4, respectively. As above, the CPU time reported here does not include the time for obtaining the initial point $x^0$. For $p = 0.1$, we observe from Table 3 that IRL$_1$-1, IRL$_1$-2 and IRL$_1$-3 achieve the best objective function values on 4, 4 and 3 instances out of the total 10 instances, respectively. The average CPU time of IRL$_1$-3 is much less than that of IRL$_1$-1 and
Table 3: Comparison of three IRL$_1$ methods for problem (5) with p = 0.1

   Problem             Objective Value                    CPU Time
    m      n      IRL1-1     IRL1-2     IRL1-3      IRL1-1   IRL1-2   IRL1-3
   120    512     0.6557     0.6007     0.6011       1.25     1.86     0.93
   240   1024     1.1916     1.2090     1.2108       1.96     3.99     2.16
   360   1536     1 70/17    i.oyoo     i. ( zoo     3.46     7.34     2.74
   480   2048     2.3025     2.3112     2.3270       y.Do     1/1 Q1   ( .00
   600   2560     2.7888     2.7432     2.7432      13.50    27.29    20.90
   720   3072     3.3639     3.4051     3.4296      19.96    36.75    21.51
   840   3584     3.7613     3.7614     3.7085      24.26    46.68    36.10
   960   4096     4.4721     4.2879     4.2980      60.26    58.77    47.98
  1080   4608     5.0258     4.8848     4.8649      72.45    69.51    39.35
  1200   5120     5.2228     5.3789     5.3561      83.99    91.26    57.97

Table 4: Comparison of three IRL$_1$ methods for problem (5) with p = 0.5

   Problem             Objective Value                    CPU Time
    m      n      IRL1-1     IRL1-2     IRL1-3      IRL1-1   IRL1-2   IRL1-3
   120    512     0.2408     0.2415     0.2405       2.12     1.57     0.99
   240   1024     0.4096     0.4127     0.4140       7.10     3.12     2.42
   360   1536     0.5361     0.5336     0.5336      21.50     5.31     4.04
   480   2048     0.6900     0.6900     0.6934      34.93    14.07     9.95
   600   2560     0.7725     0.7772     0.7739      61.08    21.68    25.49
   720   3072     0.9393     0.9405     0.9406     259.72    34.94    35.55
   840   3584     1.0113     1.007      1.007      313.30    47.24    39.43
   960   4096     1.1403     1.1297     1.1280     533.36    54.50    52.90
  1080   4608     1.2178     1.2186     1.2220     348.94    77.42    80.55
  1200   5120     1.3291     1.3375     1.3375     835.89   104.27   114.99

IRL$_1$-2. For $p = 0.5$, all three methods achieve similar objective function values. The overall CPU times of IRL$_1$-2 and IRL$_1$-3 are very close, and both are much less than that of IRL$_1$-1.

From the above two experiments, we observe that IRL$_1$-3 is generally more stable than IRL$_1$-1 and IRL$_1$-2 in terms of objective function value and CPU time.

6 Concluding remarks 

In this paper we studied iterative reweighted minimization methods for $l_p$ regularized unconstrained minimization problems (1). In particular, we derived lower bounds for nonzero entries of first- and second-order stationary points, and hence also of local minimizers of (1). We extended some existing IRL$_1$ and IRL$_2$ methods to solve (1) and proposed new variants for them. Also, we provided a unified convergence analysis for these methods. In addition, we proposed a novel Lipschitz continuous $\epsilon$-approximation to $\|x\|_p^p$. Using this result, we developed new IRL$_1$ methods for (1) and showed that any accumulation point of the sequence generated by these methods is a first-order stationary point of problem (1), provided that the approximation parameter $\epsilon$ is below a computable threshold value. This is a remarkable result since all existing iterative reweighted minimization methods require that $\epsilon$ be dynamically updated and approach zero. Our computational results demonstrate that the new IRL$_1$ method is generally more stable than the existing IRL$_1$ methods [21, 18] in terms of objective function value and CPU time.

Recently, Zhao and Li [32] proposed an IRL$_1$ minimization method to identify sparse solutions to underdetermined linear systems based on a class of regularizers. When applied to the $l_p$ regularizer, their method becomes one of the first type of IRL$_1$ methods discussed in Subsection 3.1. Though we only studied the $l_p$ regularized minimization problems, the techniques developed in our paper can be useful for analyzing the iterative reweighted minimization methods for optimization problems with other regularizers.